In Python 2, the codecs module can be used to handle Unicode.
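
A minimal sketch of a round trip through the codecs module, assuming UTF-8 as the encoding (the sample text is only for illustration):

```python
import codecs

text = u"caf\u00e9"                         # Unicode string with one non-ASCII character
encoded = codecs.encode(text, "utf-8")      # Unicode text -> bytes
decoded = codecs.decode(encoded, "utf-8")   # bytes -> Unicode text

print(repr(encoded))
print(decoded == text)
```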

Strings are immutable objects in Python.
Any operation that appears to change a string actually creates a new string.
Although this might sound inefficient, it rarely is in practice, because strings that are no longer referenced are reclaimed by garbage collection.
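
A short sketch of what immutability means in practice (the variable names are only for illustration):

```python
s = "abc"
t = s.upper()            # returns a new string; s itself is unchanged

try:
    s[0] = "x"           # strings do not support item assignment
except TypeError as exc:
    error = str(exc)

print(s, t)
print(error)
```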

Recommendation on using ' or ".
Use ' unless the string itself contains a ', e.g.

'Hi'
"Kight's"

A null byte does not terminate a string in Python (unlike in C).
Python stores both the string's length and its text in memory.
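
A quick check that an embedded null byte is just another character (the sample string is arbitrary):

```python
s = "ab\x00cd"       # null byte in the middle of the string

length = len(s)      # the null byte counts as one character
tail = s[3:]         # the text after the null byte is still there

print(length)
print(tail)
```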

Commonly Used Operations¶

In [1]:

# Raw strings (backslashes are not treated as escapes), commonly used for regular expressions
r"\n\t"

Out[1]:

'\\n\\t'

In [2]:

# Remove whitespace
" qwe ".strip()

Out[2]:

'qwe'

In [3]:

# split on delimiter
"a,b,c".split(",")

Out[3]:

['a', 'b', 'c']

In [4]:

# replace
"123123123".replace("12", "ab")

Out[4]:

'ab3ab3ab3'

In [5]:

"123".isdigit()

Out[5]:

True

In [6]:

",".join(["a", "b", "c"])

Out[6]:

'a,b,c'

In [7]:

# slicing with a step of -1 reverses the string

s = "abcde"
s[::-1]

Out[7]:

'edcba'

Commonly Used Methods¶

In [8]:

"abcabc".replace("ab", "de", 1)

Out[8]:

'decabc'

Concatenate¶

In [9]:

"a" + "b" + "c"

Out[9]:

'abc'

In [10]:

"".join(["a", "b", "c"])

Out[10]:

'abc'

The second method is often faster when joining many strings, because it does not build an intermediate string for every +.
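
A sketch of the two approaches when the pieces come from a loop. It only shows that both give the same result; actual timings depend on the interpreter and the input size.

```python
parts = [str(i) for i in range(5)]

# Repeated +: each step creates a new intermediate string.
result_plus = ""
for p in parts:
    result_plus = result_plus + p

# join: the result is assembled in a single call.
result_join = "".join(parts)

print(result_plus)
print(result_join)
```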

String Formatting¶

'...%s...' % (values)¶

%[(keyname)][flags][width][.precision]typecode

keyname: key used to index into a dict
flags
- - : left justification
- 0 : fill with 0
width: minimum field width
precision: digits after the decimal point (for floats)
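
A sketch combining the components listed above in single conversions (the value 3.14159 and the key x are arbitrary):

```python
# key name + left-justify flag + width 8 + precision 2
left = "%(x)-8.2f|" % {"x": 3.14159}

# zero-fill flag + width 8 + precision 3
zero = "%08.3f" % 3.14159

# for strings, precision limits the number of characters
short = "%.2s" % "abcdef"

print(repr(left))
print(repr(zero))
print(repr(short))
```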

Key Name¶

It is particularly useful when generating HTML or XML.

In [11]:

"%(a)s---%(b)s" % {"a": 1, "b": 2}

Out[11]:

'1---2'
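
Key names pay off when the same value appears more than once in a template. A minimal sketch with a hypothetical HTML snippet (the URL and title are placeholders):

```python
template = '<a href="%(url)s" title="%(title)s">%(title)s</a>'
html = template % {"url": "https://example.com", "title": "Example"}

print(html)
```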

Flags¶

In [12]:

"%-6sabc" % "a"

Out[12]:

'a     abc'

In [13]:

"%06d" % 5

Out[13]:

'000005'

'...{}...'.format(values)¶

In [14]:

# By position

"{0} {1} {2}".format(1, 2, 3)

Out[14]:

'1 2 3'

In [15]:

# By keyword

"{a} {b}".format(a=1, b=2)

Out[15]:

'1 2'

In [16]:

# Use dict

d = {"a": 1, "b": 2}
"{a} {b}".format(**d)

Out[16]:

'1 2'

In [17]:

# Attribute

import sys

"{sys.platform}".format(sys=sys)

Out[17]:

'win32'

In [18]:

# Indexing

"{0[0]}".format(["abc", "def"])

Out[18]:

'abc'

Detail format¶

{fieldname component !conversionflag :formatspec}
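
A sketch of how the three parts of a replacement field combine, assuming the standard conversion flags !r (repr) and !s (str):

```python
# field 0, converted with repr(), right-aligned in width 8
with_repr = "{0!r:>8}".format("ab")

# field 0, converted with str(), right-aligned in width 8
with_str = "{0!s:>8}".format("ab")

# field name with an index component, then a format spec
indexed = "{0[1]:<5}|".format(["x", "yz"])

print(repr(with_repr))
print(repr(with_str))
print(repr(indexed))
```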

Details of formatspec

[[fill]align][sign][#][0][width][,][.precision][typecode]

align
- < : left alignment
- > : right alignment
- = : padding after a sign character
- ^ : center alignment
,: comma as thousands separator
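
A sketch of each alignment option on a field of width 6 (sample values are arbitrary):

```python
left = "{:<6}|".format("ab")       # pad on the right
right = "{:>6}".format("ab")       # pad on the left
center = "{:^6}".format("ab")      # pad on both sides
signed = "{:=+6d}".format(42)      # padding goes between the sign and the digits
filled = "{:*^9}".format("mid")    # custom fill character with center alignment

for value in (left, right, center, signed, filled):
    print(repr(value))
```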

In [19]:

"{0:^20,d}".format(123456789)

Out[19]:

'    123,456,789     '

Comparing the Two Methods¶

format supports more features, for example:
- binary format
- thousands separator

In [20]:

"{0:b}".format(2**16 - 1)

Out[20]:

'1111111111111111'

In [21]:

"{:,d}".format(123456789)

Out[21]:

'123,456,789'
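
For contrast, a sketch showing that the % operator has no binary type code; bin() is one possible workaround:

```python
n = 2**16 - 1

try:
    "%b" % n
    percent_supports_binary = True
except ValueError:
    # 'b' is not a valid type code for % formatting of strings
    percent_supports_binary = False

via_bin = bin(n)[2:]               # strip the '0b' prefix
via_format = "{0:b}".format(n)

print(percent_supports_binary)
print(via_bin == via_format)
```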

General Type Categories¶

Numbers (integers, floating-point...)
Sequences (strings, lists, tuples)
Mappings (dictionaries)
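
The categories matter because types in the same category share operations. A small sketch (sample values are arbitrary):

```python
sequences = ["abc", ["a", "b", "c"], ("a", "b", "c")]

# Indexing, len() and slicing work the same way on every sequence type.
firsts = [seq[0] for seq in sequences]
lengths = [len(seq) for seq in sequences]
reversed_copies = [seq[::-1] for seq in sequences]

# Mappings are accessed by key rather than by position.
mapping = {"a": 1, "b": 2}
value = mapping["b"]

print(firsts, lengths, reversed_copies, value)
```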