import re
import numpy as np
re.findall(pattern, text)
Returns a list containing all non-overlapping matches of the pattern in the text.
In the example below we extract every run of three consecutive digits.
text = 'Jeswin 619 George 999 98Y Y27J0 JK871'
re.findall(r'\d{3}', text)
['619', '999', '871']
To get only the first or the last match, index the returned list.
re.findall(r'\d{3}', text)[0]
'619'
re.findall(r'\d{3}', text)[-1]
'871'
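Indexing the result of re.findall raises an IndexError when the pattern matches nothing. A minimal sketch of a guard is shown below; `first_match` is a hypothetical helper name, not something from the original notebook, and it relies on re.search, which stops at the first match.

```python
import re

def first_match(pattern, text, default=None):
    """Return the first match of pattern in text, or default if there is none."""
    # re.search returns a match object for the first match, or None.
    match = re.search(pattern, text)
    return match.group(0) if match else default

text = 'Jeswin 619 George 999 98Y Y27J0 JK871'
print(first_match(r'\d{3}', text))  # 619
print(first_match(r'\d{4}', text))  # None, because no run of four digits exists
```

For the last match, checking that the list from re.findall is non-empty before using [-1] gives the same protection.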
re.sub(pattern, desired_string, text)
Replaces every match with the text of your choice.
text = 'Jeswin 619 George 999 98Y Y27J0 JK871'
To replace every Y with the string B:
re.sub('Y', "B", text)
'Jeswin 619 George 999 98B B27J0 JK871'
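Two further options of re.sub may be useful here: the `count` argument limits how many matches are replaced, and the replacement can be a function that receives each match object. The sketch below illustrates both on the same text; the variable names `first_only` and `lowered` are chosen only for this example.

```python
import re

text = 'Jeswin 619 George 999 98Y Y27J0 JK871'

# count=1 replaces only the first match and leaves the rest untouched.
first_only = re.sub('Y', 'B', text, count=1)
print(first_only)  # Jeswin 619 George 999 98B Y27J0 JK871

# A callable replacement is called once per match and returns the new text.
lowered = re.sub(r'[A-Z]', lambda m: m.group(0).lower(), text)
print(lowered)  # jeswin 619 george 999 98y y27j0 jk871
```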
In the given text, make every alphanumeric word (a word containing both letters and digits) purely numeric and exactly three characters long. Here each uppercase letter is replaced with the digit 5, but you can use any digit of your choice.
output = []
for word in text.split():
    if re.findall(r'\d', word):
        if re.findall(r'[A-Z]', word):
            # Alphanumeric word: replace each uppercase letter with the digit 5.
            word = re.sub(r'[A-Z]', '5', word)
            if len(word) < 3:
                # Pad short words with random digits (0-9) up to length 3.
                while len(word) < 3:
                    word += str(np.random.randint(10))
            elif len(word) > 3:
                # Keep only the first three characters.
                word = word[:3]
            output.append(word)
        else:
            # Purely numeric word: keep it unchanged.
            output.append(word)
    else:
        # Word without any digit: keep it unchanged.
        output.append(word)
output
['Jeswin', '619', 'George', '999', '985', '527', '558']
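The same rule can be packed into a small function and applied with a list comprehension. This is only a sketch: `to_three_digits` and its `fill` parameter are hypothetical names, and short words are padded with the fixed fill digit instead of a random one so that the result is reproducible.

```python
import re

def to_three_digits(word, fill='5'):
    """Turn an alphanumeric word into a three-digit string; leave other words alone."""
    # Words without any digit, or without any uppercase letter, stay unchanged.
    if not re.search(r'\d', word) or not re.search(r'[A-Z]', word):
        return word
    # Replace letters with the fill digit, then cut or pad to length 3.
    digits = re.sub(r'[A-Z]', fill, word)
    return digits[:3].ljust(3, fill)

text = 'Jeswin 619 George 999 98Y Y27J0 JK871'
output = [to_three_digits(word) for word in text.split()]
print(output)  # ['Jeswin', '619', 'George', '999', '985', '527', '558']
```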
Combine the list of words into a single sentence using the string method join().
' '.join(output)
'Jeswin 619 George 999 985 527 558'
text
'Jeswin 619 George 999 98Y Y27J0 JK871'