String modification and sampling change

Question

I have this table:

ID	points	values (x1;y1|x2;y2|x3;y3|x4;y4...)
1	8	0,5;1|1;1,5|4;6|5;7|6;9|8;10|10;12|15;18
2	4	20;30|21;32|22;36|25;37
3	306	1;2|3;6|7;9|10;17|11;18|13;22|14;25|19;26|..

The points column gives the number of points in each row. For example, 306 means 306 x values and 306 y values. My overall goal is to change the sampling density (the start and end points remain): when I have 8 points, I want

Accepted Answer

It seems wasteful to create so many new dataframe columns, when many of the cells will be empty, and there is no relation between the values in any given column. More naturally, you could store each sample of points as a list containing pairs, all within one new column of the dataframe.

To obtain the point lists, you can manipulate each values string to match the Python syntax and then pass it to eval(), if you can trust the data source to contain no malicious code.

The sampling can then be done with Python's slicing syntax, although it's a bit tricky, because you want to include the first and last values.

The above transformations can be defined as a function, so that you can easily apply them to each string in the values column:

    import pandas as pd
    from math import ceil

    df = pd.DataFrame({'ID': [1, 2, 3],
                       'points': [8, 4, 306],
                       'values': ['0,5;1|1;1,5|4;6|5;7|6;9|8;10|10;12|15;18',
                                  '20;30|21;32|22;36|25;37',
                                  '1;2|3;6|7;9|10;17|11;18|13;22|14;25|19;26']})

    def list_sample(s):
        """
        Convert string s to a list of value pairs
        and return the list with every other pair left out
        (but may leave no or double gap in the middle,
        to always include the last pair).
        """
        pair_string = '[(' + s.replace(',', '.').replace(
            ';', ',').replace('|', '), (') + ')]'
        pair_list = eval(pair_string)
        mid = ceil(len(pair_list) / 2)
        return pair_list[:mid:2] + list(reversed(pair_list[-1:(mid-1):-2]))

    df['sample'] = df['values'].apply(list_sample)
    df

      ID points values                                    sample
    0 1  8      0,5;1|1;1,5|4;6|5;7|6;9|8;10|10;12|15;18  [(0.5, 1), (4, 6), (8, 10), (15, 18)]
    1 2  4      20;30|21;32|22;36|25;37                   [(20, 30), (25, 37)]
    2 3  306    1;2|3;6|7;9|10;17|11;18|13;22|14;25|19;26 [(1, 2), (7, 9), (13, 22), (19, 26)]
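If the data source cannot be trusted, the eval() step can probably be avoided by splitting the string directly. The following is only a minimal sketch of that idea, not part of the original answer: parse_pairs and sample_pairs are hypothetical names, every value is converted to float (so 1 becomes 1.0), and the slicing is copied from list_sample above.

```python
from math import ceil


def parse_pairs(s):
    """Parse 'x1;y1|x2;y2|...' (decimal commas) into a list of (x, y) float tuples."""
    pairs = []
    for chunk in s.split('|'):
        x_text, y_text = chunk.split(';')
        # The data uses a decimal comma, so swap it for a point before converting.
        pairs.append((float(x_text.replace(',', '.')),
                      float(y_text.replace(',', '.'))))
    return pairs


def sample_pairs(pair_list):
    """Keep every other pair, using the same slicing as list_sample in the answer."""
    mid = ceil(len(pair_list) / 2)
    return pair_list[:mid:2] + list(reversed(pair_list[-1:(mid - 1):-2]))


pairs = parse_pairs('0,5;1|1;1,5|4;6|5;7|6;9|8;10|10;12|15;18')
print(sample_pairs(pairs))
# [(0.5, 1.0), (4.0, 6.0), (8.0, 10.0), (15.0, 18.0)]
```

With this approach a malformed chunk (one without exactly one ';', or with a non-numeric value) raises ValueError instead of being executed as code. The two functions can be combined and passed to df['values'].apply in the same way as list_sample.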


ID	points	values (x1;y1|x2;y2|x3;y3|x4;y4...)	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16
1	8	0,5;1|1;1,5|4;6|5;7|6;9|8;10|10;12|15;18	0,5	1	1	1,5	4	6	5	7	6	9	8	10	10	12	15	18
..	…	..	……………………………..	..	..	.	.	.	.	.	.	.	.	.	.	.	.	.

ID	points	values (x1;y1|x2;y2|x3;y3|x4;y4...)	x1	y1	x2	y2	x3	y3	x4	y4
1	8	0,5;1|1;1,5|4;6|5;7|6;9|8;10|10;12|15;18	0,5	1	4	6	8	10	15	18
..	……	……………………………………..	.	.	.	.	.	.	.	.
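
If the wide layout in the table above is still needed, a sampled list of pairs can be flattened into x1, y1, x2, y2, ... entries. This is a minimal sketch under assumptions: pairs_to_columns is a hypothetical helper name, and the input is assumed to be a list of (x, y) tuples like the ones stored in the sample column of the accepted answer.

```python
def pairs_to_columns(pair_list):
    """Map [(x1, y1), (x2, y2), ...] to {'x1': x1, 'y1': y1, 'x2': x2, ...}."""
    columns = {}
    for i, (x, y) in enumerate(pair_list, start=1):
        columns[f'x{i}'] = x
        columns[f'y{i}'] = y
    return columns


row = pairs_to_columns([(0.5, 1), (4, 6), (8, 10), (15, 18)])
print(row)
# {'x1': 0.5, 'y1': 1, 'x2': 4, 'y2': 6, 'x3': 8, 'y3': 10, 'x4': 15, 'y4': 18}
```

Building a dataframe from one such dict per row, for example with pd.DataFrame(list_of_dicts) where list_of_dicts is a hypothetical list of these dicts, leaves NaN in the columns that shorter rows do not fill. That is the sparseness the accepted answer warns about.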

