Python regex string to list of words (including words with hyphens)


I would like to parse a string into a list of all its words, including hyphenated words. The current code is:

  import re
  s = '-this is. A - sentence;one-word'
  re.compile(r"\W+", re.UNICODE).split(s)

returns:

  ['', 'this', 'is', 'A', 'sentence', 'one', 'word']

and I would like it to return:

  ['', 'this', 'is', 'A', 'sentence', 'one-word']

If you don't need the leading empty string, you can match the words directly with the pattern \w(?:[-\w]*\w)? instead of splitting:

  >>> import re
  >>> s = '-this is. A - sentence;one-word'
  >>> rx = re.compile(r'\w(?:[-\w]*\w)?')
  >>> rx.findall(s)
  ['this', 'is', 'A', 'sentence', 'one-word']
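To see why hyphens are kept only inside a word, it may help to spell the pattern out piece by piece. The sketch below is the same regex written with re.VERBOSE; the names WORD_RX and words are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as in the answer, annotated with re.VERBOSE.
WORD_RX = re.compile(r"""
    \w              # a word must start with a word character
    (?:
        [-\w]*      # then any run of word characters or hyphens
        \w          # ...as long as it ends on a word character
    )?              # the tail is optional, so a one-letter word still matches
""", re.VERBOSE)

def words(text):
    """Return the words in text, keeping internal hyphens (hypothetical helper)."""
    return WORD_RX.findall(text)

print(words('-this is. A - sentence;one-word'))
print(words('trailing- -leading well--known'))
```

Because a match has to begin and end on a word character, leading and trailing hyphens are dropped. A run of several hyphens inside a word, as in well--known, is still kept as one word, which may or may not be what you want.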

Note that this pattern does not match words with apostrophes, such as won't, as a single word; it returns won and t separately.
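If apostrophes matter for your input, one possible tweak is to allow a straight apostrophe in the same character class as the hyphen. This is only a sketch under that assumption; rx_apos is a made-up name, and typographic apostrophes (’) would need to be added to the class separately.

```python
import re

# Assumption: a straight apostrophe inside a word is kept, just like a hyphen.
rx_apos = re.compile(r"\w(?:[-'\w]*\w)?")

sample = "-this won't split 'one-word' or rock'n'roll"
print(rx_apos.findall(sample))
```

As with hyphens, an apostrophe is kept only between word characters, so surrounding single quotes are stripped from 'one-word'.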

