Input (2 columns). Note: the Harry and Prof. rows do not have starting quotes. Output (2 columns). What I tried (PySpark)? Issue: the code above worked fine where the multiline value had both starting and ending double quotes (e.g. the row starting with Ronald), but it didn't work for rows that have only ending quotes and no starting quotes (like Harry
Tag: awk
Remove duplicates from each cell
I have a file like this and need to remove the duplicates in each cell without changing the order or format. Missing data are marked with a . (dot). So far I have tried awk, but it breaks the format. Is there another way to do this? Expected output. Answer with sed
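The excerpt above omits the sample file, so any solution has to guess the layout. A minimal awk sketch, assuming tab-separated cells whose values are comma-separated and where `.` marks missing data (the actual format in the question may differ):

```shell
# Assumed layout: cells separated by tabs, values inside a cell separated by
# commas, "." marking missing data. Deduplicate within each cell only,
# keeping first occurrences and leaving the tab/comma format intact.
dedup_cells() {
  awk -F'\t' -v OFS='\t' '{
    for (i = 1; i <= NF; i++) {
      n = split($i, vals, ",")
      split("", seen)                 # reset per-cell duplicate tracker
      out = ""
      for (j = 1; j <= n; j++)
        if (!(vals[j] in seen)) {     # keep first occurrence only
          seen[vals[j]] = 1
          out = (out == "" ? vals[j] : out "," vals[j])
        }
      $i = out
    }
    print
  }'
}

printf 'a,b,a\t.,.\tc,c,d\n' | dedup_cells
```

Because `-v OFS='\t'` is set and each field is reassigned, awk rebuilds the record with tabs, so the column layout survives.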
Prepend a url to all relative image links in a markdown document
I have a bunch of markdown documents with a mix of relative and absolute image destinations, e.g. I want to prepend a URL to each of the relative images, e.g. to change the former into the latter, but preferably without hard-coding /sub/folder/ into the replace script (which is how I currently do it). Is there a clever way to do this with
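One sed approach, sketched under assumptions the question leaves open: image links look like `![alt](dest)`, a destination counts as relative when it starts with neither `/` nor contains a scheme separator `:`, and `BASE_URL` below is a placeholder, not a value from the question:

```shell
BASE_URL='https://example.com/assets/'   # placeholder base URL

# Prefix BASE_URL onto markdown image destinations that are relative:
# first char is not "/" and the destination contains no ":" (so scheme'd
# URLs such as https://... are left alone). Destinations with a title
# string or a literal ":" would need a smarter pattern.
prepend_base() {
  sed -E "s|(!\[[^]]*\]\()([^)/:][^):]*)\)|\1${BASE_URL}\2)|g"
}

printf '![a](img/x.png)\n![b](https://host/x.png)\n![c](/abs/x.png)\n' | prepend_base
```

Only the first line gains the prefix; the absolute URL and the root-relative path pass through unchanged.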
Building matrix with values from multiple files
I have multiple files from which I need to build a matrix of matching values. File_1, the primary file, contains all the numbers, tab-delimited, in one row. For each of the other files, if a number matches, append a 1, otherwise a 0, to the matrix. File_2 File_3 Output Answer: awk to the rescue!
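The real file contents are not shown, so the formats below are guesses: file_1 holds one tab-delimited row of numbers, and each subsequent file lists one number per line. A sketch of the "awk to the rescue" idea, emitting one 1/0 row per extra file:

```shell
# Build assumed sample inputs in a scratch directory.
tmp=$(mktemp -d)
printf '10\t20\t30\n' > "$tmp/file_1"   # primary row (assumed format)
printf '10\n30\n'     > "$tmp/file_2"   # one number per line (assumed)
printf '20\n'         > "$tmp/file_3"

matrix=$(awk '
  NR == FNR { split($0, hdr); nh = NF; print; next }  # echo the primary row
  FNR == 1 && started { flush() }                     # new file: emit prev row
  { seen[$1] = 1; started = 1 }
  END { if (started) flush() }
  function flush(   i, row) {
    row = (hdr[1] in seen) ? 1 : 0
    for (i = 2; i <= nh; i++)
      row = row "\t" ((hdr[i] in seen) ? 1 : 0)
    print row
    split("", seen); started = 0
  }
' "$tmp/file_1" "$tmp/file_2" "$tmp/file_3")

printf '%s\n' "$matrix"
rm -r "$tmp"
```

The `NR == FNR` guard reads the primary row first; the `FNR == 1` trigger flushes one 0/1 row every time awk moves on to the next file, with `END` flushing the last one.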
Replace each reoccuring string value of a one line flat json to a random value using python
I have a JSON file (input.json) like the following, which has 2 rows… (the real one has more than 1m rows). I basically want to change the value of every field named four to a random value, so wherever it appears, it will remove what it currently has and change it to a randomly chosen value:
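The question asks for Python, but since each record is flat, one-line JSON, a stream edit in this tag's spirit also works. A sketch, assuming the `"four"` values are plain strings with no embedded escaped quotes, and substituting a fixed token where a real run would generate a random one (e.g. from `$RANDOM`):

```shell
# Rewrite every value of the "four" key in flat one-line JSON records.
# "NEW" stands in for the random value; input lines here are made up.
printf '{"one":1,"four":"abc"}\n{"four":"xyz","two":2}\n' |
  sed -E 's/"four":"[^"]*"/"four":"NEW"/g'
```

sed streams, so 1m+ rows are no problem; but for arbitrary or nested JSON, a real parser (Python's `json` module, or jq) is the safer tool.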
Split CSV values on single row into individual rows
I have a Python script that outputs a text file with thousands of random filenames in a comma-separated list, all on a single row. I want to take each value in the list and put it on its own row in a new CSV file. I’ve tried some variations of awk with no success. What’s the best way to
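For a single comma-separated row, awk is arguably overkill; the simplest sketch just translates each comma into a newline (assuming the filenames themselves contain no commas):

```shell
# One comma-separated line in, one value per row out.
printf 'a.txt,b.txt,c.txt\n' | tr ',' '\n'
```

An awk equivalent would be `awk -F',' '{ for (i = 1; i <= NF; i++) print $i }'`, which is worth reaching for once quoting or per-field logic enters the picture.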
How to completely erase the duplicated lines by linux tools?
This question is not the same as How to print only the unique lines in BASH?, because that one suggests removing all copies of the duplicated lines, while this one is about eliminating only their duplicates, i.e., changing 1, 2, 3, 3 into 1, 2, 3 instead of just 1, 2. This question is really hard to write because I
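What the question describes is the classic dedupe-keeping-first-occurrence problem, for which there is a well-known awk idiom:

```shell
# Print each line only the first time it is seen, preserving order:
# 1,2,3,3 becomes 1,2,3 -- not 1,2, which is what dropping every line
# that has a duplicate would give.
printf '1\n2\n3\n3\n' | awk '!seen[$0]++'
```

`seen[$0]++` is 0 (falsy) on a line's first appearance and nonzero afterwards, so `!seen[$0]++` prints exactly the first occurrence; unlike `sort -u`, it needs no sorting and keeps the original order.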
What are the differences between Perl, Python, AWK and sed? [closed]
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance. Closed 10 years ago.