!pip install -U spacy
Collecting spacy
  Downloading https://files.pythonhosted.org/packages/47/13/80ad28ef7a16e2a86d16d73e28588be5f1085afd3e85e4b9b912bd700e8a/spacy-2.2.3-cp36-cp36m-manylinux1_x86_64.whl (10.4MB)
Requirement already satisfied, skipping upgrade: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.6/dist-packages (from spacy) (2.0.3)
Requirement already satisfied, skipping upgrade: srsly<1.1.0,>=0.1.0 in /usr/local/lib/python3.6/dist-packages (from spacy) (0.2.0)
Requirement already satisfied, skipping upgrade: numpy>=1.15.0 in /usr/local/lib/python3.6/dist-packages (from spacy) (1.17.4)
Collecting preshed<3.1.0,>=3.0.2
  Downloading https://files.pythonhosted.org/packages/db/6b/e07fad36913879757c90ba03d6fb7f406f7279e11dcefc105ee562de63ea/preshed-3.0.2-cp36-cp36m-manylinux1_x86_64.whl (119kB)
Collecting thinc<7.4.0,>=7.3.0
  Downloading https://files.pythonhosted.org/packages/07/59/6bb553bc9a5f072d3cd479fc939fea0f6f682892f1f5cff98de5c9b615bb/thinc-7.3.1-cp36-cp36m-manylinux1_x86_64.whl (2.2MB)
Requirement already satisfied, skipping upgrade: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.6/dist-packages (from spacy) (1.0.2)
Collecting catalogue<1.1.0,>=0.0.7
  Downloading https://files.pythonhosted.org/packages/4f/d5/46ff975f0d7d055cf95557b944fd5d29d9dfb37a4341038e070f212b24fe/catalogue-0.0.8-py2.py3-none-any.whl
Requirement already satisfied, skipping upgrade: setuptools in /usr/local/lib/python3.6/dist-packages (from spacy) (41.6.0)
Requirement already satisfied, skipping upgrade: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.6/dist-packages (from spacy) (0.4.0)
Requirement already satisfied, skipping upgrade: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.6/dist-packages (from spacy) (0.9.6)
Requirement already satisfied, skipping upgrade: requests<3.0.0,>=2.13.0 in /usr/local/lib/python3.6/dist-packages (from spacy) (2.21.0)
Collecting blis<0.5.0,>=0.4.0
  Downloading https://files.pythonhosted.org/packages/41/19/f95c75562d18eb27219df3a3590b911e78d131b68466ad79fdf5847eaac4/blis-0.4.1-cp36-cp36m-manylinux1_x86_64.whl (3.7MB)
Requirement already satisfied, skipping upgrade: tqdm<5.0.0,>=4.10.0 in /usr/local/lib/python3.6/dist-packages (from thinc<7.4.0,>=7.3.0->spacy) (4.28.1)
Requirement already satisfied, skipping upgrade: importlib-metadata>=0.20; python_version < "3.8" in /usr/local/lib/python3.6/dist-packages (from catalogue<1.1.0,>=0.0.7->spacy) (0.23)
Requirement already satisfied, skipping upgrade: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests<3.0.0,>=2.13.0->spacy) (2019.9.11)
Requirement already satisfied, skipping upgrade: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests<3.0.0,>=2.13.0->spacy) (3.0.4)
Requirement already satisfied, skipping upgrade: urllib3<1.25,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests<3.0.0,>=2.13.0->spacy) (1.24.3)
Requirement already satisfied, skipping upgrade: idna<2.9,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests<3.0.0,>=2.13.0->spacy) (2.8)
Requirement already satisfied, skipping upgrade: zipp>=0.5 in /usr/local/lib/python3.6/dist-packages (from importlib-metadata>=0.20; python_version < "3.8"->catalogue<1.1.0,>=0.0.7->spacy) (0.6.0)
Requirement already satisfied, skipping upgrade: more-itertools in /usr/local/lib/python3.6/dist-packages (from zipp>=0.5->importlib-metadata>=0.20; python_version < "3.8"->catalogue<1.1.0,>=0.0.7->spacy) (7.2.0)
Installing collected packages: preshed, blis, thinc, catalogue, spacy
  Found existing installation: preshed 2.0.1
    Uninstalling preshed-2.0.1:
      Successfully uninstalled preshed-2.0.1
  Found existing installation: blis 0.2.4
    Uninstalling blis-0.2.4:
      Successfully uninstalled blis-0.2.4
  Found existing installation: thinc 7.0.8
    Uninstalling thinc-7.0.8:
      Successfully uninstalled thinc-7.0.8
  Found existing installation: spacy 2.1.9
    Uninstalling spacy-2.1.9:
      Successfully uninstalled spacy-2.1.9
Successfully installed blis-0.4.1 catalogue-0.0.8 preshed-3.0.2 spacy-2.2.3 thinc-7.3.1
!python -m spacy download el
Collecting el_core_news_sm==2.2.5
  Downloading https://github.com/explosion/spacy-models/releases/download/el_core_news_sm-2.2.5/el_core_news_sm-2.2.5.tar.gz (11.4MB)
Requirement already satisfied: spacy>=2.2.2 in /usr/local/lib/python3.6/dist-packages (from el_core_news_sm==2.2.5) (2.2.3)
Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (2.0.3)
Requirement already satisfied: numpy>=1.15.0 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (1.17.4)
Requirement already satisfied: catalogue<1.1.0,>=0.0.7 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (0.0.8)
Requirement already satisfied: setuptools in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (41.6.0)
Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (3.0.2)
Requirement already satisfied: thinc<7.4.0,>=7.3.0 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (7.3.1)
Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (1.0.2)
Requirement already satisfied: wasabi<1.1.0,>=0.4.0 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (0.4.0)
Requirement already satisfied: plac<1.2.0,>=0.9.6 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (0.9.6)
Requirement already satisfied: requests<3.0.0,>=2.13.0 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (2.21.0)
Requirement already satisfied: blis<0.5.0,>=0.4.0 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (0.4.1)
Requirement already satisfied: srsly<1.1.0,>=0.1.0 in /usr/local/lib/python3.6/dist-packages (from spacy>=2.2.2->el_core_news_sm==2.2.5) (0.2.0)
Requirement already satisfied: importlib-metadata>=0.20; python_version < "3.8" in /usr/local/lib/python3.6/dist-packages (from catalogue<1.1.0,>=0.0.7->spacy>=2.2.2->el_core_news_sm==2.2.5) (0.23)
Requirement already satisfied: tqdm<5.0.0,>=4.10.0 in /usr/local/lib/python3.6/dist-packages (from thinc<7.4.0,>=7.3.0->spacy>=2.2.2->el_core_news_sm==2.2.5) (4.28.1)
Requirement already satisfied: urllib3<1.25,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests<3.0.0,>=2.13.0->spacy>=2.2.2->el_core_news_sm==2.2.5) (1.24.3)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests<3.0.0,>=2.13.0->spacy>=2.2.2->el_core_news_sm==2.2.5) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests<3.0.0,>=2.13.0->spacy>=2.2.2->el_core_news_sm==2.2.5) (2019.9.11)
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests<3.0.0,>=2.13.0->spacy>=2.2.2->el_core_news_sm==2.2.5) (2.8)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.6/dist-packages (from importlib-metadata>=0.20; python_version < "3.8"->catalogue<1.1.0,>=0.0.7->spacy>=2.2.2->el_core_news_sm==2.2.5) (0.6.0)
Requirement already satisfied: more-itertools in /usr/local/lib/python3.6/dist-packages (from zipp>=0.5->importlib-metadata>=0.20; python_version < "3.8"->catalogue<1.1.0,>=0.0.7->spacy>=2.2.2->el_core_news_sm==2.2.5) (7.2.0)
Building wheels for collected packages: el-core-news-sm
  Building wheel for el-core-news-sm (setup.py) ... done
  Created wheel for el-core-news-sm: filename=el_core_news_sm-2.2.5-cp36-none-any.whl size=11422786 sha256=a2e4fd3c86575b7c8ae7a4ec4211ae68a7c27a748573477b9b43a831742e2b44
  Stored in directory: /tmp/pip-ephem-wheel-cache-634wxpxz/wheels/70/a1/c5/6690d6b524d87e287a8070cf957f834fb1b1665b9ede11348b
Successfully built el-core-news-sm
Installing collected packages: el-core-news-sm
Successfully installed el-core-news-sm-2.2.5
✔ Download and installation successful
You can now load the model via spacy.load('el_core_news_sm')
✔ Linking successful
/usr/local/lib/python3.6/dist-packages/el_core_news_sm --> /usr/local/lib/python3.6/dist-packages/spacy/data/el
You can now load the model via spacy.load('el')
import spacy
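# Alternative: load the model by its full package name.
# nlp = spacy.load("el_core_news_sm")
# "el" is the shortcut link that the download step above created.
nlp = spacy.load("el")
# English translation of the sample text: "This is an example for text processing.
# It was created by Dimitris Panagopoulos in November 2019 in Athens.
# You can run it on Google's Colab."
sample_text = "Αυτό είναι ένα παράδειγμα για την επεξεργασία κειμένου. Δημιουργήθηκε από το Δημήτρη Παναγόπουλο τον Νoέμβριο του 2019 στην Αθήνα. Μπορείτε να το τρέξετε στο Colab της Google"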
doc = nlp(sample_text)
# Print each token with its lemma and coarse part-of-speech tag.
for token in doc:
    print(token.text, token.lemma_, token.pos_)
Αυτό αυτό PRON
είναι είναι AUX
ένα ένα DET
παράδειγμα παράδειγμα NOUN
για για ADP
την την DET
επεξεργασία επεξεργασίας NOUN
κειμένου κειμένο NOUN
. . PUNCT
Δημιουργήθηκε δημιουργήθηκε VERB
από από ADP
το το DET
Δημήτρη δημήτρη NOUN
Παναγόπουλο παναγόπουλο NOUN
τον τον DET
Νoέμβριο νoέμβριο NOUN
του του DET
2019 2019 NUM
στην στην ADJ
Αθήνα Αθήνα PROPN
. . PUNCT
Μπορείτε μπορείτε VERB
να να PART
το το PRON
τρέξετε τρέξω VERB
στο στο ADV
Colab colab X
της της DET
Google google X
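The token listing above is easier to sanity-check once it is summarised. The sketch below is only an illustration: it hard-codes the rows of the first sentence as they appear in the output above (the names `rows`, `pos_counts` and `changed` are made up here), counts the part-of-speech tags, and lists the tokens whose lemma differs from the lowercased text.

```python
from collections import Counter

# (text, lemma, pos) rows copied from the first sentence of the output above.
rows = [
    ("Αυτό", "αυτό", "PRON"),
    ("είναι", "είναι", "AUX"),
    ("ένα", "ένα", "DET"),
    ("παράδειγμα", "παράδειγμα", "NOUN"),
    ("για", "για", "ADP"),
    ("την", "την", "DET"),
    ("επεξεργασία", "επεξεργασίας", "NOUN"),
    ("κειμένου", "κειμένο", "NOUN"),
    (".", ".", "PUNCT"),
]

# How often each coarse POS tag occurs in the sentence.
pos_counts = Counter(pos for _, _, pos in rows)

# Tokens where the lemmatizer changed more than the letter case.
changed = [(text, lemma) for text, lemma, _ in rows if lemma != text.lower()]

print(pos_counts.most_common())
print(changed)
```

With the real `doc` from the cell above, the same tally should be obtainable with `Counter(token.pos_ for token in doc)`, without copying rows by hand.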
from spacy import displacy
displacy.render(doc, style="ent", jupyter=True)
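`displacy.render` draws the entities as HTML, which is not preserved in this export. As a plain-text alternative, the entities can be read from `doc.ents`, where each span exposes `text` and `label_`. The helper below is a minimal sketch: `list_entities` is a name invented here, and the stand-in object with its single entity is hypothetical, used only so the snippet runs without a downloaded model. It is not output from `el_core_news_sm`.

```python
from types import SimpleNamespace


def list_entities(doc):
    """Return (text, label) pairs for every entity span in doc.ents."""
    return [(ent.text, ent.label_) for ent in doc.ents]


# Hypothetical stand-in for a spaCy Doc, NOT real model output.
stand_in_doc = SimpleNamespace(
    ents=[SimpleNamespace(text="Αθήνα", label_="GPE")]
)

for text, label in list_entities(stand_in_doc):
    print(text, label)
```

In the notebook itself, `list_entities(doc)` can be called on the `doc` created earlier; which spans and labels come back depends on the model.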
# Three words to compare: "dog", "cat", "king".
sample_words = "σκύλος γάτα βασιλιάς"
tokens = nlp(sample_words)
print(tokens)
σκύλος γάτα βασιλιάς
print(tokens[0].similarity(tokens[1]))
/usr/lib/python3.6/runpy.py:193: ModelsWarning: [W007] The model you're using has no word vectors loaded, so the result of the Token.similarity method will be based on the tagger, parser and NER, which may not give useful similarity judgements. This may happen if you're using one of the small models, e.g. `en_core_web_sm`, which don't ship with word vectors and only use context-sensitive tensors. You can always add your own word vectors, or use one of the larger models instead if available.
  "__main__", mod_spec)
0.69062674
print(tokens[0].similarity(tokens[2]))
0.4917702
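The W007 warning explains why the two scores above should be read with caution: the small model ships no word vectors. To my understanding, `Token.similarity` is a cosine similarity between two vectors, so what the score means depends entirely on which vectors go in. The sketch below shows that computation in plain Python. The three-dimensional vectors are invented for illustration and do not come from any spaCy model, and returning 0.0 for an all-zero vector is a convention chosen here.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy vectors, invented for this example only.
toy_vectors = {
    "σκύλος": [0.9, 0.8, 0.1],    # dog
    "γάτα": [0.8, 0.9, 0.2],      # cat
    "βασιλιάς": [0.1, 0.2, 0.9],  # king
}

dog_cat = cosine_similarity(toy_vectors["σκύλος"], toy_vectors["γάτα"])
dog_king = cosine_similarity(toy_vectors["σκύλος"], toy_vectors["βασιλιάς"])
print(dog_cat > dog_king)
```

For scores based on real word vectors, the warning itself suggests adding your own vectors or using a larger model if one is available for the language.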