We use the machine translation models from the VISTEC-depa Thailand Artificial Intelligence Research Institute.
!pip install fairseq
Successfully installed PyYAML-5.4.1 antlr4-python3-runtime-4.8 dataclasses-0.6 fairseq-0.10.2 hydra-core-1.0.6 omegaconf-2.0.6 portalocker-2.0.0 sacrebleu-1.5.1
!pip install sacremoses sentencepiece
Requirement already satisfied: sacremoses in /usr/local/lib/python3.7/dist-packages (0.0.43)
Successfully installed sentencepiece-0.1.95
!pip install https://github.com/PyThaiNLP/pythainlp/archive/dev.zip
Requirement already satisfied (use --upgrade to upgrade): pythainlp==2.3.0.dev0 from https://github.com/PyThaiNLP/pythainlp/archive/dev.zip in /usr/local/lib/python3.7/dist-packages
Successfully built pythainlp
Download and install all models
from pythainlp.translate import download_model_all
download_model_all()
Corpus: scb_1m_en-th_moses - Downloading: scb_1m_en-th_moses 1.0
100%|██████████| 1174648148/1174648148 [00:14<00:00, 81506882.14it/s]
Corpus: scb_1m_th-en_spm - Downloading: scb_1m_th-en_spm 1.0
100%|██████████| 703780432/703780432 [00:08<00:00, 78234386.81it/s]
from pythainlp.translate import EnThTranslator, ThEnTranslator
Usage: EnThTranslator().translate(text) or ThEnTranslator().translate(text)
- th is the Thai language
- en is the English language

We have 1 model
- scb_1m_en-th_moses
  - bpe tokenizer

print(EnThTranslator().translate("I want fried chicken."))
ไก่ทอดค่ะ
print(ThEnTranslator().translate("ผมอยากกินไก่ทอด"))
I want fried chicken.
print(ThEnTranslator().translate("ผมอยากเขียนโปรแกรมคอมพิวเตอร์"))
I want to write a computer program.
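The calls above build a new translator for every sentence. As a minimal sketch, assuming each translator can be reused across calls to `translate`, the helper below picks a translator by language pair and applies it to a list of sentences. The names `get_translator`, `translate_batch`, and `TRANSLATOR_FACTORIES` are hypothetical and not part of PyThaiNLP; only `EnThTranslator`, `ThEnTranslator`, and their `translate(text)` method come from the cells above.

```python
# Hypothetical helpers; only EnThTranslator / ThEnTranslator and .translate()
# are taken from the notebook cells above.

def _en_th():
    # Imported lazily so this module loads even before the models are installed.
    from pythainlp.translate import EnThTranslator
    return EnThTranslator()


def _th_en():
    from pythainlp.translate import ThEnTranslator
    return ThEnTranslator()


# Maps a (source, target) language pair to a function that builds a translator.
TRANSLATOR_FACTORIES = {("en", "th"): _en_th, ("th", "en"): _th_en}


def get_translator(source, target, factories=TRANSLATOR_FACTORIES):
    """Build the translator for a language pair such as ("th", "en")."""
    try:
        factory = factories[(source, target)]
    except KeyError:
        raise ValueError(f"unsupported language pair: {source}->{target}") from None
    return factory()


def translate_batch(translator, texts):
    """Translate each sentence with one shared translator object."""
    return [translator.translate(text) for text in texts]


# Example, to be run after download_model_all():
#   th_en = get_translator("th", "en")
#   for line in translate_batch(th_en, ["ผมอยากกินไก่ทอด", "ผมอยากเขียนโปรแกรมคอมพิวเตอร์"]):
#       print(line)
```

Building the translator once and passing it in keeps model loading out of the loop, and it lets the batch function work with any object that has a `translate` method.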