bash - remove duplicates per line


I have several large text files (CSV). Some lines contain unnecessary entries: because of the way the files were merged, the same value can appear two or three times in a particular field, and not always in the same order. For example:

BWTL, Newsletter, Newsletter - BWTL, Newsletter, R2R, Newsletter
MPWJ, OTA Hot, Ota Hot, OTA Host - OTA Hot, ITOs, OTA Host

etc. Entries that are next to each other are easy enough to clean up with sed:

sed -i "s/Newsletter, Newsletter/Newsletter/g" *.csv

Is there a quick way to fix the other duplicates, the ones that are not next to each other?

You could do something like this:

  sed -i "s/^\(.*Newsletter.*\), Newsletter/\1/g" eNewsletter.csv_new.csv

It works by capturing everything from the start of the line up to, but not including, a later repeat of Newsletter. ^ anchors the match at the beginning of the line, \( and \) capture the part of the line between them, and .* matches anything, including nothing at all. The whole match is then replaced with the captured part, \1, which drops the trailing ", Newsletter".
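The sed approach has to be written once per value and removes only one repeat per pass. If the goal is to drop any repeated entry on a line regardless of its value, a more general route is to split each line into fields and keep only the first occurrence of each one. Below is a minimal sketch of that idea using awk. It assumes the fields are separated by exactly ", " (comma plus space) and that the comparison should be case-sensitive; the function name dedupe_fields is made up for this example.

```shell
# dedupe_fields (hypothetical helper name): for every input line, print the
# ", "-separated fields with later repeats removed, keeping first occurrences.
# Assumption: fields are separated by exactly ", " and matching is case-sensitive.
dedupe_fields() {
  awk -F', ' '{
    split("", seen)                 # clear the lookup table for each new line
    out = ""
    n = 0
    for (i = 1; i <= NF; i++) {
      if (!($i in seen)) {          # first time this value appears on the line
        seen[$i] = 1
        out = (n++ ? (out ", " $i) : $i)
      }
    }
    print out
  }' "$@"
}

# Example: a non-adjacent duplicate is removed.
printf 'BWTL, Newsletter, R2R, Newsletter\n' | dedupe_fields
# prints: BWTL, Newsletter, R2R
```

Called with file names instead of a pipe, it writes the cleaned lines to standard output rather than editing in place, so you would redirect to a new file and check the result before replacing the original. Values that differ only in case or spelling (for example "OTA Hot" and "OTA Host") are treated as distinct.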

