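Aaargh!!! Regular expressions, the thing I run into every single time I try to get anything done. I was this close to giving up on Google references, opening a book, and just memorizing the whole lot!!
But I changed my mind one more time and decided to write a post instead..
Along the way I also stumbled on a [way to strip whitespace] using regular expressions.. Whitespace.. I had never heard the word before, yet somehow my heart understood it first..
"trim(), thanks for everything. I'm a bit of a neat freak, so every time I snipped a string I felt uneasy. It felt like a burp was about to come out ('trim' sounds like the Korean word for burp; blame my weakness for dad jokes)"
Source: http://helloboy.tistory.com/entry/%EC%A0%95%EA%B7%9C-%ED%91%9C%ED%98%84%EC%8B%9D-%EC%98%88%EC%A0%9C1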
Matching patterns in text: basics
1. Character literals
/a/
Mary had a little lamb.
And everywhere that Mary
went, the lamb was sure
to go.
/Mary/
Mary had a little lamb.
And everywhere that Mary
went, the lamb was sure
to go.
2. "Escaped" characters literals
/.*/
Special characters must be escaped.*
/\.\*/
Special characters must be escaped.*
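The highlighting that showed what each pattern matched did not survive here, so a small Python sketch may help. Python's `re` module is my choice for illustration (the `/.../` notation above is tool-neutral), and the variable names are made up.

```python
import re

line = "Special characters must be escaped.*"

# Unescaped: "." is any character and "*" means "zero or more",
# so ".*" swallows the whole line.
greedy = re.search(r".*", line).group()

# Escaped: matches only the two literal characters ".*".
literal = re.search(r"\.\*", line).group()

# re.escape builds the escaped form of a literal string for you.
escaped_pattern = re.escape(".*")

print(repr(greedy))
print(repr(literal))
print(escaped_pattern)
```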
3. Positional special characters
/^Mary/
Mary had a little lamb.
And everywhere that Mary
went, the lamb was sure
to go.
/Mary$/
Mary had a little lamb.
And everywhere that Mary
went, the lamb was sure
to go.
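The examples above seem to assume a line-oriented tool, where `^` and `$` apply to each line. In Python that behavior has to be requested with `re.MULTILINE`, as this rough sketch shows (names are illustrative).

```python
import re

text = (
    "Mary had a little lamb.\n"
    "And everywhere that Mary\n"
    "went, the lamb was sure\n"
    "to go."
)

# Without flags, ^ and $ refer to the start and end of the whole string.
start_default = [m.start() for m in re.finditer(r"^Mary", text)]
end_default = [m.start() for m in re.finditer(r"Mary$", text)]

# With re.MULTILINE they also match at every line boundary.
end_multiline = [m.start() for m in re.finditer(r"Mary$", text, re.MULTILINE)]

print(start_default, end_default, end_multiline)
```

Only the second line ends in "Mary", so the multiline search finds exactly one position.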
4. The "wildcard" character
/.a/
Mary had a little lamb.
And everywhere that Mary
went, the lamb was sure
to go.
5. Grouping regular expressions
/(Mary)( )(had)/
Mary had a little lamb.
And everywhere that Mary
went, the lamb was sure
to go.
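A quick way to check the wildcard and grouping examples is to run them on the first line. This is a minimal Python sketch, not the original tutorial's code.

```python
import re

line = "Mary had a little lamb."

# "." matches any single character (except a newline by default),
# so ".a" finds every "a" together with the character before it.
pairs = re.findall(r".a", line)

# Parentheses group sub-patterns; each group can be read back by number.
m = re.match(r"(Mary)( )(had)", line)
groups = m.groups()

print(pairs)   # ['Ma', 'ha', ' a', 'la']
print(groups)  # ('Mary', ' ', 'had')
```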
6. Character classes
/[a-z]a/
Mary had a little lamb.
And everywhere that Mary
went, the lamb was sure
to go.
7. Complement operator
/[^a-z]a/
Mary had a little lamb.
And everywhere that Mary
went, the lamb was sure
to go.
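Character classes and their complement split the `.a` matches into two disjoint sets. Here is a small Python sketch on the first line of the sample text (variable names are mine).

```python
import re

line = "Mary had a little lamb."

# [a-z]a : a lowercase letter followed by "a".
lower_then_a = re.findall(r"[a-z]a", line)

# [^a-z]a : any character that is NOT a lowercase letter, followed by "a".
other_then_a = re.findall(r"[^a-z]a", line)

print(lower_then_a)  # ['ha', 'la']
print(other_then_a)  # ['Ma', ' a']
```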
8. Alternation of patterns
/cat|dog|bird/
The pet store sold cats, dogs, and birds.
/=first|second=/
=first first= # =second second= # =first= # =second=
/(=)(first)|(second)(=)/
=first first= # =second second= # =first= # =second=
/=(first|second)=/
=first first= # =second second= # =first= # =second=
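The point of the three variants is that `|` binds more loosely than anything else. A hedged Python sketch of the first and last patterns:

```python
import re

text = "=first first= # =second second= # =first= # =second="

# "|" has the lowest precedence: this means "=first" OR "second=".
loose = re.findall(r"=first|second=", text)

# Parentheses limit the alternation to the two words.
# finditer + group(0) returns the whole match rather than the group.
strict = [m.group(0) for m in re.finditer(r"=(first|second)=", text)]

print(loose)   # ['=first', 'second=', '=first', 'second=']
print(strict)  # ['=first=', '=second=']
```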
9. The basic abstract quantifier
/@(=\+=)*@/
Match with zero in the middle: @@
Subexpression occurs, but...: @=+=ABC@
Lots of occurrences: @=+==+==+==+==+=@
Must repeat entire pattern: @=+==+=+==+=@
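To check which of the four samples match, here is a minimal Python sketch. Note one assumption: the plus sign in the sample text is literal, so the sketch writes it as `\+` (an unescaped `+` would be read as a quantifier).

```python
import re

samples = ["@@", "@=+=ABC@", "@=+==+==+==+==+=@", "@=+==+=+==+=@"]

# (=\+=)* means "zero or more copies of the three characters =, +, =".
pattern = re.compile(r"@(=\+=)*@")

results = [bool(pattern.search(s)) for s in samples]
print(results)  # [True, False, True, False]
```

The last sample fails because the group has to repeat as a whole unit; a stray `+` in the middle breaks the run.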
Matching patterns in text: intermediate
1. More abstract quantifiers
/A+B*C?D/
AAAD
ABBBBCD
BBBCD
ABCCD
AAABBBC
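Read the pattern as: one or more A, then any number of B, then an optional C, then a D. A short Python sketch to test the five lines:

```python
import re

lines = ["AAAD", "ABBBBCD", "BBBCD", "ABCCD", "AAABBBC"]

# +  one or more, *  zero or more, ?  zero or one
pattern = re.compile(r"A+B*C?D")

matches = {s: bool(pattern.search(s)) for s in lines}
for s, ok in matches.items():
    print(s, ok)
```

Only the first two lines match: the third has no A, the fourth has two Cs, and the fifth has no D.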
2. Numeric quantifiers
/a{5} b{,6} c{4,8}/
aaaaa bbbbb ccccc
aaa bbb ccc
aaaaa bbbbbbbbbbbbbb ccccc
/a+ b{3,} c?/
aaaaa bbbbb ccccc
aaa bbb ccc
aaaaa bbbbbbbbbbbbbb ccccc
/a{5} b{6,} c{4,8}/
aaaaa bbbbb ccccc
aaa bbb ccc
aaaaa bbbbbbbbbbbbbb ccccc
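The three patterns can be compared side by side. This is a sketch in Python, where `{,6}` means "zero to six"; other tools may treat a missing lower bound differently, so take that as an assumption about this one engine.

```python
import re

lines = ["aaaaa bbbbb ccccc", "aaa bbb ccc", "aaaaa bbbbbbbbbbbbbb ccccc"]

patterns = [
    r"a{5} b{,6} c{4,8}",  # exactly 5 a, at most 6 b, 4 to 8 c
    r"a+ b{3,} c?",        # 1+ a, at least 3 b, optional c
    r"a{5} b{6,} c{4,8}",  # exactly 5 a, at least 6 b, 4 to 8 c
]

# table[i][j] is True when patterns[i] matches somewhere in lines[j]
table = [[bool(re.search(p, s)) for s in lines] for p in patterns]
for p, row in zip(patterns, table):
    print(p, row)
```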
3. Backreferences
/(abc|xyz) \1/
jkl abc xyz
jkl xyz abc
jkl abc abc
jkl xyz xyz
/(abc|xyz) (abc|xyz)/
jkl abc xyz
jkl xyz abc
jkl abc abc
jkl xyz xyz
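The difference between the two patterns is easiest to see by testing all four lines. A minimal Python sketch:

```python
import re

lines = ["jkl abc xyz", "jkl xyz abc", "jkl abc abc", "jkl xyz xyz"]

# \1 must repeat the exact text captured by group 1.
same_word = [bool(re.search(r"(abc|xyz) \1", s)) for s in lines]

# Writing the group twice lets each side choose independently.
any_pair = [bool(re.search(r"(abc|xyz) (abc|xyz)", s)) for s in lines]

print(same_word)  # [False, False, True, True]
print(any_pair)   # [True, True, True, True]
```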
4. Don't match more than you want to
/th.*s/
-- I want to match the words that start
-- with 'th' and end with 's'.
this
thus
thistle
this line matches too much
5. Tricks for restraining matches
/th[^s]*./
-- I want to match the words that start
-- with 'th' and end with 's'.
this
thus
thistle
this line matches too much
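Here is a rough Python comparison of the greedy pattern and the restrained one on the four sample lines. The trick stops at the first `s` only because `[^s]*` cannot step over it.

```python
import re

lines = ["this", "thus", "thistle", "this line matches too much"]

# Greedy: ".*" runs to the LAST "s" on the line.
greedy = [re.search(r"th.*s", s).group() for s in lines]

# Restrained: "[^s]*" stops before the first "s", then "." takes it.
restrained = [re.search(r"th[^s]*.", s).group() for s in lines]

print(greedy)      # ['this', 'thus', 'this', 'this line matches']
print(restrained)  # ['this', 'thus', 'this', 'this']
```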
A literal-string modification example
s/cat/dog/g
< The zoo had wild dogs, bobcats, lions, and other wild cats.
> The zoo had wild dogs, bobdogs, lions, and other wild dogs.
A pattern-match modification example
s/cat|dog/snake/g
< The zoo had wild dogs, bobcats, lions, and other wild cats.
> The zoo had wild snakes, bobsnakes, lions, and other wild snakes.
s/[a-z]+i[a-z]*/nice/g
< The zoo had wild dogs, bobcats, lions, and other wild cats.
> The zoo had nice dogs, bobcats, nice, and other nice cats.
Modification using backreferences
s/([A-Z])([0-9]{2,4}) /\2:\1 /g
< A37 B4 C107 D54112 E1103 XXX
> 37:A B4 107:C D54112 1103:E XXX
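The `s/pattern/replacement/g` notation above is sed/Perl style. In Python the rough equivalent is `re.sub`, which replaces every match by default. A sketch of three of the substitutions:

```python
import re

zoo = "The zoo had wild dogs, bobcats, lions, and other wild cats."

# Literal string replacement.
literal = re.sub(r"cat", "dog", zoo)

# Pattern replacement: either word becomes "snake".
alternation = re.sub(r"cat|dog", "snake", zoo)

# Backreferences: \1 and \2 in the replacement swap the captured pieces.
codes = "A37 B4 C107 D54112 E1103 XXX"
swapped = re.sub(r"([A-Z])([0-9]{2,4}) ", r"\2:\1 ", codes)

print(literal)
print(alternation)
print(swapped)  # 37:A B4 107:C D54112 1103:E XXX
```

`B4` and `D54112` are left alone because they have one and five digits, outside the `{2,4}` range.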
Advanced regular expression extensions
Non-greedy quantifiers
/th.*s/
-- I want to match the words that start
-- with 'th' and end with 's'.
this line matches just right
this # thus # thistle
/th.*?s/
-- I want to match the words that start
-- with 'th' and end with 's'.
this # thus # thistle
this line matches just right
/th.*?s /
-- I want to match the words that start
-- with 'th' and end with 's'. (FINALLY!)
this # thus # thistle
this line matches just right
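The three steps above (greedy, non-greedy, non-greedy with a trailing space) can be replayed on the `#`-separated line. A minimal Python sketch:

```python
import re

text = "this # thus # thistle"

# Greedy: one long match up to the last "s".
greedy = re.findall(r"th.*s", text)

# Non-greedy: ".*?" stops at the first "s" it can.
lazy = re.findall(r"th.*?s", text)

# Requiring a space after the "s" drops the partial hit inside "thistle".
lazy_space = re.findall(r"th.*?s ", text)

print(greedy)      # ['this # thus # this']
print(lazy)        # ['this', 'thus', 'this']
print(lazy_space)  # ['this ', 'thus ']
```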
Pattern-match modifiers
/M.*[ise] /
MAINE # Massachusetts # Colorado #
mississippi # Missouri # Minnesota #
/M.*[ise] /i
MAINE # Massachusetts # Colorado #
mississippi # Missouri # Minnesota #
/M.*[ise] /gis
MAINE # Massachusetts # Colorado #
mississippi # Missouri # Minnesota #
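The modifier letters map onto Python flags roughly like this (my mapping, as an assumption about equivalent behavior): `i` is `re.IGNORECASE`, `s` is `re.DOTALL`, and `g` corresponds to using `findall` instead of a single `search`.

```python
import re

text = (
    "MAINE # Massachusetts # Colorado #\n"
    "mississippi # Missouri # Minnesota #"
)
pattern = r"M.*[ise] "

# No flags: case-sensitive, "." stops at the newline.
plain = re.findall(pattern, text)

# i: "M" also matches "m", and [ise] also matches uppercase letters.
ignore_case = re.findall(pattern, text, re.IGNORECASE)

# i + s: "." also crosses the newline, giving one long match.
dotall = re.findall(pattern, text, re.IGNORECASE | re.DOTALL)

print(plain)        # ['MAINE # Massachusetts ', 'Missouri ']
print(ignore_case)  # ['MAINE # Massachusetts ', 'mississippi # Missouri ']
print(len(dotall))  # 1
```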
Changing backreference behavior
s/([A-Z])(?:-[a-z]{3}-)([0-9]*)/\1\2/g
< A-xyz-37 # B:abcd:142 # C-wxy-66 # D-qrs-93
> A37 # B:abcd:142 # C66 # D93
Naming backreferences
import re
txt = "A-xyz-37 # B:abcd:142 # C-wxy-66 # D-qrs-93"
# raw strings keep the backslashes intact; print() is a function in Python 3
print(re.sub(r"(?P<prefix>[A-Z])(-[a-z]{3}-)(?P<id>[0-9]*)",
             r"\g<prefix>\g<id>", txt))
A37 # B:abcd:142 # C66 # D93
Lookahead assertions
s/([A-Z]-)(?=[a-z]{3})([a-z0-9]*)/\2\1/g
< A-xyz37 # B-ab6142 # C-Wxy66 # D-qrs93
> xyz37A- # B-ab6142 # C-Wxy66 # qrs93D-
s/([A-Z]-)(?![a-z]{3})([A-Za-z0-9]*)/\2\1/g
< A-xyz37 # B-ab6142 # C-Wxy66 # D-qrs93
> A-xyz37 # ab6142B- # Wxy66C- # D-qrs93
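A lookahead checks what comes next without consuming it. The Python sketch below uses illustrative variants of the patterns: the second group has no trailing space, and in the negative case it also accepts uppercase letters, so that the results can be checked exactly.

```python
import re

txt = "A-xyz37 # B-ab6142 # C-Wxy66 # D-qrs93"

# Positive lookahead: swap only when three lowercase letters follow "X-".
positive = re.sub(r"([A-Z]-)(?=[a-z]{3})([a-z0-9]*)", r"\2\1", txt)

# Negative lookahead: swap only when they do NOT follow.
negative = re.sub(r"([A-Z]-)(?![a-z]{3})([A-Za-z0-9]*)", r"\2\1", txt)

print(positive)  # xyz37A- # B-ab6142 # C-Wxy66 # qrs93D-
print(negative)  # A-xyz37 # ab6142B- # Wxy66C- # D-qrs93
```

The two results are complementary: every item is rewritten by exactly one of the two patterns.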
Making regular expressions more readable
/ # identify URLs within a text file
[^="] # do not match URLs in IMG tags like:
# <img src="http://mysite.com/mypic.png">
(?:http|ftp|gopher) # make sure we find a resource type
:\/\/ # ...needs to be followed by colon-slash-slash
[^ \n\r]+ # stuff other than space, newline, carriage return is in URL
(?=[\s\.,]) # assert: followed by whitespace/period/comma
/x
The URL for my site is: http://mysite.com/mydoc.html. You
might also enjoy ftp://yoursite.com/index.html for a good
place to download files.
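In Python, the commented layout is enabled with `re.VERBOSE`. The sketch below is my own adaptation rather than the pattern above verbatim: it uses a lookbehind instead of consuming the preceding character, groups the alternation, and keeps a trailing period or comma out of the match.

```python
import re

url_re = re.compile(r"""
    (?<![="])             # skip URLs inside attributes such as IMG src
    (                     # group 1: the URL itself
      (?:http|ftp|gopher) # resource type, grouped so the bar stays local
      ://                 # followed by colon-slash-slash
      [^ \n\r]*           # anything but space, newline, carriage return
      [^ \n\r.,]          # last character is not a period or comma
    )
    (?=[\s.,]|$)          # assert: whitespace, period, comma, or end follows
""", re.VERBOSE)

text = (
    "The URL for my site is: http://mysite.com/mydoc.html. You\n"
    "might also enjoy ftp://yoursite.com/index.html for a good\n"
    "place to download files.\n"
    '<img src="http://mysite.com/mypic.png">'
)

urls = url_re.findall(text)
print(urls)
```

The extra `<img>` line is an assumption added to the sample text to show the exclusion; with it, only the two prose URLs are found.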