Python regex string to list of words (including words with hyphens)
I would like to parse a string to obtain a list of all its words (hyphenated words, too). The current code is:
    import re
    s = '-this is. A - sentence;one-word'
    re.compile(r"\W+", re.UNICODE).split(s)
which returns:
    ['', 'this', 'is', 'A', 'sentence', 'one', 'word']
and I would like it to return:
    ['', 'this', 'is', 'A', 'sentence', 'one-word']
If you don't need the leading empty string, you could use the pattern \w(?:[-\w]*\w)? for matching instead of splitting:
    >>> import re
    >>> s = '-this is. A - sentence;one-word'
    >>> rx = re.compile(r'\w(?:[-\w]*\w)?')
    >>> rx.findall(s)
    ['this', 'is', 'A', 'sentence', 'one-word']
Each match starts and ends with a word character and may contain hyphens in between, so a stray hyphen such as the leading "-" or the standalone " - " is never part of a word.
Note that this pattern won't match words containing apostrophes, such as won't, as a single word.
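To make the apostrophe caveat concrete, here is a minimal sketch that wraps the answer's pattern in a helper and compares it with a variant that also accepts apostrophes inside a word. The names words, WORD_RX and WORD_APOS_RX and the keep_apostrophes parameter are hypothetical, and the apostrophe variant is an assumption about what one might want, not part of the original answer.

```python
import re

# Pattern from the answer: a word character, optionally followed by
# hyphens/word characters, and ending in a word character.
WORD_RX = re.compile(r"\w(?:[-\w]*\w)?")

# Assumed variant (not from the original answer): also allow
# apostrophes in the middle of a word.
WORD_APOS_RX = re.compile(r"\w(?:[-'\w]*\w)?")


def words(text, keep_apostrophes=False):
    """Return the words in text, keeping hyphenated words whole."""
    rx = WORD_APOS_RX if keep_apostrophes else WORD_RX
    return rx.findall(text)


print(words("-this is. A - sentence;one-word"))
# ['this', 'is', 'A', 'sentence', 'one-word']
print(words("it won't work"))
# ['it', 'won', 't', 'work']
print(words("it won't work", keep_apostrophes=True))
# ['it', "won't", 'work']
```

Because both patterns require a word character at each end, a leading or trailing hyphen or apostrophe is still dropped rather than attached to the word.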