\ 특수 문자 Escape (or start a sequence)
. 줄바꿈을 제외한 모든 문자 (참고 re.DOTALL)
^ 문자열의 시작 (참고 re.MULTILINE)
$ 문자열의 마지막 (re.MULTILINE)
[] 문자 집합
| 또는
() Capture 그룹 생성 (우선순위 지정)
[
' 문자 이후에 사용할 수 있는 특수 문자는 다음과 같다.¶] 집합의 끝
- 범위 (예 a-c 는 a, b 또는 c를 의미)
^ Negate를 의미
Quantifiers (append '?
' for non-greedy):
{m} Exactly m repetitions
{m,n} From m (default 0) to n (default infinity)
* 0 or more. Same as {,}
+ 1 or more. Same as {1,}
? 0 or 1. Same as {,1}
Special sequences::
\A Start of string
\b Match empty string at word (\w+) boundary
\B Match empty string not at word boundary
\d Digit
\D Non-digit
\s Whitespace - [ \t\n\r\f\v]과 동일 (참고: LOCALE, UNICODE)
\S Non-whitespace
\w Alphanumeric: [0-9a-zA-Z_], see LOCALE
\W Non-alphanumeric
\Z End of string
\g<id> Match prev named or numbered group,
'<' & '>' are literal, e.g. \g<0>
or \g<name> (not \g0 or \gname)
Special character escapes are much like those already escaped in Python string
literals. Hence regex '\n
' is same as regex '\\n
'::
\a ASCII Bell (BEL)
\f ASCII Formfeed
\n ASCII Linefeed
\r ASCII Carriage return
\t ASCII Tab
\v ASCII Vertical tab
\\ A single backslash
\xHH Two digit hexadecimal character goes here
\OOO Three digit octal char (or just use an
initial zero, e.g. \0, \09)
\DD Decimal number 1 to 99, match
previous numbered group
Extensions. Do not cause grouping, except 'P<name>
'::
(?iLmsux) Match empty string, sets re.X flags
(?:...) Non-capturing version of regular parens
(?P<name>...) Create a named capturing group.
(?P=name) Match whatever matched prev named group
(?#...) A comment; ignored.
(?=...) Lookahead assertion, match without consuming
(?!...) Negative lookahead assertion
(?<=...) Lookbehind assertion, match if preceded
(?<!...) Negative lookbehind assertion
(?(id)y|n) Match 'y' if group 'id' matched, else 'n'
Flags for re.compile(), etc. Combine with '|'
::
re.I == re.IGNORECASE Ignore case
re.L == re.LOCALE Make \w, \b, and \s locale dependent
re.M == re.MULTILINE Multiline
re.S == re.DOTALL Dot matches all (including newline)
re.U == re.UNICODE Make \w, \b, \d, and \s unicode dependent
re.X == re.VERBOSE Verbose (unescaped whitespace in pattern
is ignored, and '#' marks comment lines)
Module level functions::
compile(pattern[, flags]) -> RegexObject
match(pattern, string[, flags]) -> MatchObject
search(pattner, string[, flags]) -> MatchObject
findall(pattern, string[, flags]) -> list of strings
finditer(pattern, string[, flags]) -> iter of MatchObjects
split(pattern, string[, maxsplit, flags]) -> list of strings
sub(pattern, repl, string[, count, flags]) -> string
subn(pattern, repl, string[, count, flags]) -> (string, int)
escape(string) -> string
purge() # the re cache
RegexObjects (returned from compile()
)::
.match(string[, pos, endpos]) -> MatchObject
.search(string[, pos, endpos]) -> MatchObject
.findall(string[, pos, endpos]) -> list of strings
.finditer(string[, pos, endpos]) -> iter of MatchObjects
.split(string[, maxsplit]) -> list of strings
.sub(repl, string[, count]) -> string
.subn(repl, string[, count]) -> (string, int)
.flags # int, Passed to compile()
.groups # int, Number of capturing groups
.groupindex # {}, Maps group names to ints
.pattern # string, Passed to compile()
MatchObjects (returned from match()
and search()
)::
.expand(template) -> string, Backslash & group expansion
.group([group1...]) -> string or tuple of strings, 1 per arg
.groups([default]) -> tuple of all groups, non-matching=default
.groupdict([default]) -> {}, Named groups, non-matching=default
.start([group]) -> int, Start/end of substring match by group
.end([group]) -> int, Group defaults to 0, the whole match
.span([group]) -> tuple (match.start(group), match.end(group))
.pos int, Passed to search() or match()
.endpos int, "
.lastindex int, Index of last matched capturing group
.lastgroup string, Name of last matched capturing group
.re regex, As passed to search() or match()
.string string, "