In Python 2, the codecs module can be used to handle Unicode.
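A minimal sketch of that idea, using only `codecs.encode` and `codecs.decode`, which exist in both Python 2 and Python 3. The sample text is an arbitrary choice for illustration.

```python
import codecs

# Illustrative sample: a unicode string with one non-ASCII character.
text = u"caf\u00e9"

# Encode the unicode text to UTF-8 bytes, then decode the bytes back.
raw = codecs.encode(text, "utf-8")
roundtrip = codecs.decode(raw, "utf-8")

print(repr(raw))
print(roundtrip == text)
```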
Strings are immutable objects in Python.
Any operation that looks like it changes a string actually creates a new string.
Although this might sound inefficient, in practice it is not: strings that are no longer referenced are reclaimed by garbage collection.
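The immutability described above can be checked directly. This is a small sketch; the variable names are arbitrary.

```python
s = "abc"

# Item assignment is rejected because str is immutable.
try:
    s[0] = "x"
    mutated = True
except TypeError:
    mutated = False

# "Changing" the string builds a new object; the original is untouched.
t = s.replace("a", "x")

print(mutated)
print(s, t)
print(t is s)
```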
Recommendation on using ' or ".
Use ' unless you need to include ' in your string.
e.g.
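The sample strings below are made up for illustration. They show the default single-quote style, the switch to double quotes when the text contains an apostrophe, and the equivalent escaped form for comparison.

```python
# Single quotes by default.
plain = 'hello'

# Double quotes when the text itself contains a single quote.
with_apostrophe = "it's here"

# The escaped form is the same string, just harder to read.
escaped = 'it\'s here'

print(with_apostrophe == escaped)
```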
A null byte does not terminate a string in Python (unlike in C).
Python keeps both the string's length and its text in memory.
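A short check of the null-byte behaviour, as a sketch with an arbitrary sample string:

```python
s = "ab\x00cd"

# The null byte is an ordinary character: it counts toward the length,
# and the text after it is still part of the string.
length = len(s)
pieces = s.split("\x00")

print(length)
print(pieces)
```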
# Raw strings (no escape processing), commonly used for regular expressions
r"\n\t"
'\\n\\t'
# Remove leading and trailing whitespace
" qwe ".strip()
'qwe'
# split on delimiter
"a,b,c".split(",")
['a', 'b', 'c']
# replace
"123123123".replace("12", "ab")
'ab3ab3ab3'
"123".isdigit()
True
",".join(["a", "b", "c"])
'a,b,c'
# Slice with a step (a step of -1 reverses the string)
s = "abcde"
s[::-1]
'edcba'
"abcabc".replace("ab", "de", 1)
'decabc'
"a" + "b" + "c"
'abc'
"".join(["a", "b", "c"])
'abc'
The second method often runs faster.
It is particularly useful when generating HTML or XML.
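As a sketch of the join approach for markup, the hypothetical helper `render_list` below (not from any library) collects fragments in a list and joins them once at the end. It does no HTML escaping, so it assumes the items are already safe to embed.

```python
def render_list(items):
    """Build a small HTML list by collecting fragments and joining once."""
    parts = ["<ul>"]
    for item in items:
        parts.append("<li>%s</li>" % item)
    parts.append("</ul>")
    # One join at the end instead of repeated + concatenation.
    return "".join(parts)


html = render_list(["a", "b", "c"])
print(html)
```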
"%(a)s---%(b)s" % {"a": 1, "b": 2}
'1---2'
"%-6sabc" % "a"
'a     abc'
"%06d" % 5
'000005'
# By position
"{0} {1} {2}".format(1, 2, 3)
'1 2 3'
# By keyword
"{a} {b}".format(a=1, b=2)
'1 2'
# Use dict
d = {"a": 1, "b": 2}
"{a} {b}".format(**d)
'1 2'
# Attribute
import sys
"{sys.platform}".format(sys=sys)
'win32'
# Indexing
"{0[0]}".format(["abc", "def"])
'abc'
{fieldname component !conversionflag :formatspec}
Details of formatspec:
[[fill]align][sign][#][0][width][,][.precision][typecode]
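To read the grammar above against concrete cases, here is a small sketch that exercises one or two pieces at a time. The values and variable names are arbitrary.

```python
# fill='*', align='>', width=10
right = "{0:*>10}".format("abc")

# sign='+', width=8, precision=.2, typecode='f'
signed = "{0:+8.2f}".format(3.14159)

# '#' adds the 0x prefix for typecode 'x'
hexed = "{0:#x}".format(255)

# '0' pads with zeros up to width=6
padded = "{0:06d}".format(42)

# conversion flag !r applies repr() before the formatspec is used
quoted = "{0!r:>6}".format("a")

for value in (right, signed, hexed, padded, quoted):
    print(repr(value))
```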
"{0:^20,d}".format(123456789)
'    123,456,789     '
"{0:b}".format(2**16 - 1)
'1111111111111111'
"{:,d}".format(123456789)
'123,456,789'