!pip install gTTS
Collecting gTTS
Downloading gTTS-2.2.3-py3-none-any.whl (25 kB)
Requirement already satisfied: requests in /usr/local/lib/python3.7/dist-packages (from gTTS) (2.23.0)
Requirement already satisfied: click in /usr/local/lib/python3.7/dist-packages (from gTTS) (7.1.2)
Requirement already satisfied: six in /usr/local/lib/python3.7/dist-packages (from gTTS) (1.15.0)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests->gTTS) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests->gTTS) (2021.10.8)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests->gTTS) (1.24.3)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests->gTTS) (2.10)
Installing collected packages: gTTS
Successfully installed gTTS-2.2.3
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
from gtts import gTTS
from IPython.display import Audio
# Spanish sample. gTTS writes MP3 data, so the file gets an .mp3 extension.
tts = gTTS('hola David como estás', lang='es')
tts.save('1.mp3')
sound_file = '1.mp3'
Audio(sound_file, autoplay=True)
from gtts import gTTS
from IPython.display import Audio
# English sample. gTTS writes MP3 data, so the file gets an .mp3 extension.
tts = gTTS('hello David how are you?', lang='en')
tts.save('1.mp3')
sound_file = '1.mp3'
Audio(sound_file, autoplay=True)
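Both cells above write to the same file, so the second run overwrites the first. A minimal sketch of one way around that, assuming a hypothetical naming scheme (`tts_<lang>_<hash>.mp3`) and two helper names, `mp3_name` and `synthesize`, that are not part of gTTS:

```python
import hashlib


def mp3_name(text, lang):
    """Build a stable file name from the language and a short hash of the text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:8]
    return f"tts_{lang}_{digest}.mp3"


def synthesize(text, lang="en"):
    """Save `text` as speech and return the file name (needs gTTS and network access)."""
    # Imported lazily so mp3_name() can be used without gTTS installed.
    from gtts import gTTS

    filename = mp3_name(text, lang)
    gTTS(text, lang=lang).save(filename)
    return filename


print(mp3_name("hola David como estás", "es"))
print(mp3_name("hello David how are you?", "en"))
```

In a notebook, the result could then be played with `Audio(synthesize('hello David how are you?', lang='en'), autoplay=True)`.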
import requests
from bs4 import BeautifulSoup
# City in Argentina
city = "Buenos Aires"
# Build the search request; params= URL-encodes the query, including the space
url = "https://www.google.com/search"
html = requests.get(url, params={"q": "weather " + city}).content
# Parse the page
soup = BeautifulSoup(html, 'html.parser')
# These class names come from Google's markup and may change without notice
temp = soup.find('div', attrs={'class': 'BNeawe iBp4i AP7Wnd'}).text
summary = soup.find('div', attrs={'class': 'BNeawe tAd8D AP7Wnd'}).text
# Format the data: first line is the time, second line is the sky description
data = summary.split('\n')
time = data[0]
sky = data[1]
# Get all matching div tags
listdiv = soup.find_all('div', attrs={'class': 'BNeawe s3v9rd AP7Wnd'})
strd = listdiv[5].text
# Get other interesting data; find() returns -1 when 'Wind' is absent
pos = strd.find('Wind')
other_data = strd[pos:] if pos != -1 else ''
# Show the data
print(f"Temperature: {temp}")
print(f"Time: {time}")
print(f"Sky description: {sky}")
if other_data:
    print(other_data)
Temperature: 74°F
Time: Tuesday 7:44 PM
Sky description: Wind and rain
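Two details of a scraper like this are easy to get wrong: a city name containing a space has to be URL-encoded, and `str.find` returns -1 when the marker is missing, so slicing with that result silently keeps only the last character. The sketch below isolates both steps as pure functions that can be checked without a network call. The helper names (`build_search_url`, `split_summary`, `tail_from`) are hypothetical and chosen for illustration.

```python
from urllib.parse import urlencode


def build_search_url(city):
    """Return a Google search URL for 'weather <city>' with the query encoded."""
    return "https://www.google.com/search?" + urlencode({"q": f"weather {city}"})


def split_summary(summary):
    """Split a two-line summary into (time, sky); sky is '' if the second line is missing."""
    time_part, _, sky_part = summary.partition("\n")
    return time_part.strip(), sky_part.strip()


def tail_from(text, marker):
    """Return text from `marker` onward, or '' when the marker is absent."""
    pos = text.find(marker)
    return text[pos:] if pos != -1 else ""


print(build_search_url("Buenos Aires"))
print(split_summary("Tuesday 7:44 PM\nWind and rain"))
print(repr(tail_from("Cloudy.", "Wind")))
```

Keeping the parsing separate from the HTTP request also makes it easier to adapt when the page markup changes.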
!sudo apt install tesseract-ocr
!pip install pytesseract
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
tesseract-ocr-eng tesseract-ocr-osd
The following NEW packages will be installed:
tesseract-ocr tesseract-ocr-eng tesseract-ocr-osd
0 upgraded, 3 newly installed, 0 to remove and 37 not upgraded.
Need to get 4,795 kB of archives.
After this operation, 15.8 MB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 tesseract-ocr-eng all 4.00~git24-0e00fe6-1.2 [1,588 kB]
Get:2 http://archive.ubuntu.com/ubuntu bionic/universe amd64 tesseract-ocr-osd all 4.00~git24-0e00fe6-1.2 [2,989 kB]
Get:3 http://archive.ubuntu.com/ubuntu bionic/universe amd64 tesseract-ocr amd64 4.00~git2288-10f4998a-2 [218 kB]
Fetched 4,795 kB in 1s (5,033 kB/s)
debconf: unable to initialize frontend: Dialog
debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 76, <> line 3.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
dpkg-preconfigure: unable to re-open stdin:
Selecting previously unselected package tesseract-ocr-eng.
(Reading database ... 155229 files and directories currently installed.)
Preparing to unpack .../tesseract-ocr-eng_4.00~git24-0e00fe6-1.2_all.deb ...
Unpacking tesseract-ocr-eng (4.00~git24-0e00fe6-1.2) ...
Selecting previously unselected package tesseract-ocr-osd.
Preparing to unpack .../tesseract-ocr-osd_4.00~git24-0e00fe6-1.2_all.deb ...
Unpacking tesseract-ocr-osd (4.00~git24-0e00fe6-1.2) ...
Selecting previously unselected package tesseract-ocr.
Preparing to unpack .../tesseract-ocr_4.00~git2288-10f4998a-2_amd64.deb ...
Unpacking tesseract-ocr (4.00~git2288-10f4998a-2) ...
Setting up tesseract-ocr-osd (4.00~git24-0e00fe6-1.2) ...
Setting up tesseract-ocr-eng (4.00~git24-0e00fe6-1.2) ...
Setting up tesseract-ocr (4.00~git2288-10f4998a-2) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Collecting pytesseract
Downloading pytesseract-0.3.8.tar.gz (14 kB)
Preparing metadata (setup.py) ... done
Requirement already satisfied: Pillow in /usr/local/lib/python3.7/dist-packages (from pytesseract) (7.1.2)
Building wheels for collected packages: pytesseract
Building wheel for pytesseract (setup.py) ... done
Created wheel for pytesseract: filename=pytesseract-0.3.8-py2.py3-none-any.whl size=14070 sha256=ff3003458334386a624855e9a51deea055d0b33dd8cebcef0b66da7c70a5f500
Stored in directory: /root/.cache/pip/wheels/a4/89/b9/3f11250225d0f90e5454fcc30fd1b7208db226850715aa9ace
Successfully built pytesseract
Installing collected packages: pytesseract
Successfully installed pytesseract-0.3.8
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
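pytesseract is only a wrapper: it shells out to the `tesseract` binary installed by apt. Before running OCR it can help to confirm that the binary is actually on the PATH. A minimal sketch using only the standard library; the function name `tesseract_version` is an assumption made for this example:

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if the binary is not found."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some builds print the version banner to stderr, so look at both streams.
    lines = (result.stdout + result.stderr).strip().splitlines()
    return lines[0] if lines else None


print(tesseract_version())
```

A result of `None` would suggest the apt install step did not complete in the current runtime.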
from google.colab import drive
import os
drive.mount('/content/gdrive')
# Set the working directory to the Drive folder
print(os.getcwd())
os.chdir("/content/gdrive/My Drive")
Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
/content/gdrive/My Drive
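Changing the working directory affects every later cell. An alternative, sketched here under the assumption that the files live directly in "My Drive", is to build full paths explicitly. `DRIVE_ROOT` and `drive_path` are hypothetical names for this example.

```python
from pathlib import PurePosixPath

# Mount point used by drive.mount() in the cell above.
DRIVE_ROOT = PurePosixPath("/content/gdrive/My Drive")


def drive_path(name, root=DRIVE_ROOT):
    """Return the full path of a file stored under the mounted Drive folder."""
    return str(root / name)


print(drive_path("imagen_prueba.jpg"))
```

The returned string can be passed to any function that accepts a file path, so the notebook no longer depends on the current directory.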
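from PIL import Image
# Preview the test image (the notebook renders the cell's last expression)
Image.open('imagen_prueba.jpg')
import pytesseract
# Run OCR with the default English model
text = pytesseract.image_to_string(Image.open('imagen_prueba.jpg'))
print(text)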
| [How To WRITE ALT | TEXT AND IMAGE DESCRIPTIONS FOR THE VISUALLY IMPAIRED
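The raw OCR result above contains stray `|` and `[` characters, probably picked up from borders or decoration in the image. A small post-processing step can remove them. This is a sketch that assumes those characters never belong to the real text; `clean_ocr_text` is a hypothetical helper name.

```python
import re


def clean_ocr_text(text):
    """Drop stray pipe and bracket characters, then collapse runs of whitespace."""
    without_marks = re.sub(r"[|\[\]]", " ", text)
    return " ".join(without_marks.split())


raw = "| [How To WRITE ALT | TEXT AND IMAGE DESCRIPTIONS FOR THE VISUALLY IMPAIRED"
print(clean_ocr_text(raw))
```

For documents where brackets or pipes are meaningful, the character class would need to be narrowed.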
!sudo apt-get install tesseract-ocr-spa
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following NEW packages will be installed:
tesseract-ocr-spa
0 upgraded, 1 newly installed, 0 to remove and 37 not upgraded.
Need to get 951 kB of archives.
After this operation, 2,309 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 tesseract-ocr-spa all 4.00~git24-0e00fe6-1.2 [951 kB]
Fetched 951 kB in 1s (1,464 kB/s)
debconf: unable to initialize frontend: Dialog
debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 76, <> line 1.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
dpkg-preconfigure: unable to re-open stdin:
Selecting previously unselected package tesseract-ocr-spa.
(Reading database ... 155276 files and directories currently installed.)
Preparing to unpack .../tesseract-ocr-spa_4.00~git24-0e00fe6-1.2_all.deb ...
Unpacking tesseract-ocr-spa (4.00~git24-0e00fe6-1.2) ...
Setting up tesseract-ocr-spa (4.00~git24-0e00fe6-1.2) ...
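After installing a language pack, it is worth checking that Tesseract can see it before passing `lang="spa"`. The sketch below calls the `tesseract --list-langs` command and parses its output. The header format shown in the comment is an assumption about what the CLI prints, and `parse_langs` and `installed_langs` are hypothetical helper names.

```python
import shutil
import subprocess


def parse_langs(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines()]
    # Skip blank lines and the header, assumed to look like
    # 'List of available languages (3):'.
    return [line for line in lines if line and not line.lower().startswith("list of")]


def installed_langs():
    """Return the installed language codes, or [] if tesseract is not on PATH."""
    exe = shutil.which("tesseract")
    if exe is None:
        return []
    result = subprocess.run([exe, "--list-langs"], capture_output=True, text=True)
    # Older builds write the list to stderr, so fall back to it when stdout is empty.
    return parse_langs(result.stdout or result.stderr)


print("spa" in installed_langs())
```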
# Run OCR with the Spanish language model installed above
text1 = pytesseract.image_to_string(Image.open('imagen_prueba1.jpg'),
                                    lang="spa")
print(text1)
LOS LEONES El león es uno de los felinos mos grandes del mundo. El león es un mamífero. Un mamífero quere decir (que tiene sangre tibia y pelaje. El león come carne porque es un 'carnivoro. Elleón tiene una melena 'grande y peluda pora poder asustar ¡aotros animales. Elleón puede correr muy rápido para poder segur su presa. Los leones viven en grupos lamados manadas,
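The Spanish output is readable but contains recognition errors, for example "mos" where "más" is expected. One rough way to quantify OCR quality is to compare the output against a hand-typed reference. The sketch below uses `difflib.SequenceMatcher` from the standard library. The reference sentence is typed by hand for illustration and is an assumption, not something taken from the image, and `similarity` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def similarity(ocr_text, reference):
    """Return a ratio between 0 and 1 describing how closely two texts match."""

    def normalize(s):
        # Compare case-insensitively and ignore differences in whitespace.
        return " ".join(s.lower().split())

    return SequenceMatcher(None, normalize(ocr_text), normalize(reference)).ratio()


ocr = "El león es uno de los felinos mos grandes del mundo."
ref = "El león es uno de los felinos más grandes del mundo."  # hand-typed reference (assumption)
print(round(similarity(ocr, ref), 3))
```

A ratio close to 1 means few character-level differences. The number is only as reliable as the reference transcription.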