I ran the following code in RStudio: it reads a huge NASA CSV file, converts it to a dataframe, converts each element to a string, and adds them to a vector. RStudio took 4 minutes and 15 seconds, so I decided to implement the same code in Julia. I ran the following in VS Code: The result was good. The Julia
Tag: performance
Reading files faster in Python
I’m writing a script to read a TXT file where each line is a log entry, and I need to split this log into different files (one each for Hor, Sia, Lmu). I’m reading each line and dividing it into new files with no problem when using my test file (80 kB), but when I try to apply it to the actual file (177 MB
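A minimal sketch of the streaming approach, assuming each line contains its tag somewhere (the real log format, file names, and matching rule are assumptions): reading lazily and keeping the three output files open avoids holding 177 MB in memory.

    # Stream the log once, routing each line to a per-tag file.
    TAGS = ("Hor", "Sia", "Lmu")

    outputs = {tag: open(f"{tag}.txt", "w") for tag in TAGS}
    try:
        with open("big.log") as src:
            for line in src:              # lazy iteration, O(1) memory
                for tag in TAGS:
                    if tag in line:       # assumed matching rule
                        outputs[tag].write(line)
                        break
    finally:
        for f in outputs.values():
            f.close()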
Length of the intersections between a list and a list of lists
Note: almost a duplicate of Numpy vectorization: Find intersection between list and list of lists. Differences: I am focused on efficiency when the lists are large, and I’m searching for the largest intersections. Here are some assumptions: y is a list of ~500,000 sublists of ~500 elements; each sublist in y is a range, so y is characterized by the
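Since each sublist of y is a contiguous range, one way to exploit that (a sketch under that assumption; the array names and sizes are illustrative) is to store only the range endpoints and count, via np.searchsorted, how many elements of a sorted, deduplicated x fall inside each range:

    import numpy as np

    x = np.unique([3, 7, 8, 15, 21])      # sorted, unique query values
    starts = np.array([0, 5, 20])         # y[i] == range(starts[i], stops[i])
    stops  = np.array([10, 9, 30])

    lo = np.searchsorted(x, starts, side="left")
    hi = np.searchsorted(x, stops,  side="left")   # stop is exclusive
    sizes = hi - lo                                # intersection lengths
    best = np.argsort(sizes)[::-1][:5]             # largest intersections first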
How to fix list comprehension ‘bitwise_and’ error and optimize for-loop?
I have the for loop below, but I want to make it into a more computationally efficient variant. I thought I could do that with a list comprehension, but this is giving me the following error: TypeError: ufunc ‘bitwise_and’ not supported for the input types, and the inputs could not be safely coerced to any supported types according to the
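That TypeError typically appears when & is applied to plain Python lists or to unparenthesized comparisons; a generic illustration (not the asker's exact loop) is to convert to NumPy arrays and parenthesize each condition:

    import numpy as np

    a = np.array([1, 2, 3, 4])
    b = np.array([3, 3, 3, 3])

    mask = (a > 1) & (a < b)   # parentheses matter: & binds tighter than >
    result = a[mask]           # vectorized replacement for the for-loop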
How to do hyperparameter optimization on large data?
I have almost finished my time series model and collected enough data, and now I am stuck at hyperparameter optimization. After lots of googling I found a new and good library called ultraopt, but the problem is how large a fragment of my total data (~150 GB) I should use for hyperparameter tuning. And I want to try lots of
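One common pattern, sketched below, is to tune on a small random fragment first and re-check the best configurations on a larger one; the 1% fraction, file name, and chunk size are assumptions, not a recommendation from ultraopt's docs.

    import pandas as pd

    frac = 0.01                    # start small relative to the 150 GB total
    sample = pd.concat(
        chunk.sample(frac=frac, random_state=0)
        for chunk in pd.read_csv("data.csv", chunksize=1_000_000)
    )
    # Run the tuner (ultraopt or any other) against `sample`, then
    # validate the top few configurations on a bigger fragment.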
How to create a script that gives me every possible combination of a six-digit code
A friend and I want to create a script that gives us every possible permutation of a six-digit code, drawn from 36 alphanumeric characters (0-9 and a-z), in alphabetical order, and then be able to see them in a .txt file. And I want it to use all of the CPU and RAM it can, so that it takes less
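A minimal sketch with itertools.product, which already yields the codes in alphabetical order because the alphabet itself is sorted; note that 36**6 is about 2.2 billion codes, so the output file lands near 15 GB, and buffered streaming matters more than saturating every core:

    import itertools
    import string

    ALPHABET = string.digits + string.ascii_lowercase   # "0"-"9" then "a"-"z"

    with open("codes.txt", "w") as out:
        for code in itertools.product(ALPHABET, repeat=6):
            out.write("".join(code) + "\n")             # 36**6 ≈ 2.18e9 lines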
Itertools combinations: how to make it faster?
I am coding a program that takes 54 (num1) numbers and puts them in a list. It then takes 16 (num2) of those numbers and forms a list that contains lists of 16 numbers chosen from all the possible combinations C(num1, num2). It then takes those lists and generates 4×4 arrays. The code I have works, but running 54 numbers
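For scale: C(54, 16) is roughly 2.1 × 10^13, so materializing every combination is hopeless; a sketch (the input numbers and the cap of 10 are illustrative assumptions) generates lazily and reshapes each pick into a 4×4 array:

    import itertools
    import math

    import numpy as np

    nums = range(1, 55)                        # 54 placeholder numbers
    print(math.comb(54, 16))                   # ~2.1e13 combinations

    for i, combo in enumerate(itertools.combinations(nums, 16)):
        grid = np.array(combo).reshape(4, 4)   # one 4x4 array per combination
        if i >= 10:                            # only a lazy slice is feasible
            break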
FAST: 1D overlaps with rows in 2D?
Let’s say I have a 2D array, for example: I want to calculate its overlap with a 1D vector, fast. I can almost do it with (8 ms on a big array): The problem with it is that it only matches if both position and value match. For example, 5 in the 2nd column of the 1D vector did not match the 5 in the 3rd column of the 2nd
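The distinction is positional match versus value-anywhere match; a sketch with illustrative data shows both, with np.isin giving the position-independent count in one vectorized call:

    import numpy as np

    arr = np.array([[1, 5, 9],
                    [4, 5, 6],
                    [7, 2, 5]])
    vec = np.array([5, 5, 3])

    positional = (arr == vec).sum(axis=1)       # matches only in same column
    anywhere = np.isin(arr, vec).sum(axis=1)    # matches regardless of position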
Performance tuning: string wordcount in df
I have a df with a column “free text”. I wish to count how many characters and words each cell has. Currently, I do it like this: The problem is that it is pretty slow. I thought about using np.where but I wasn’t sure how. Would appreciate your help here. Answer: IIUC, you can try via str.len() and str.count(). Sample dataframe used:
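A sketch of that str.len() / str.count() approach on an illustrative stand-in frame (the real column contents are unknown):

    import pandas as pd

    df = pd.DataFrame({"free text": ["hello world", "one two three"]})

    df["char_count"] = df["free text"].str.len()
    df["word_count"] = df["free text"].str.count(" ") + 1  # crude: assumes single spaces
    # df["free text"].str.split().str.len() handles empty cells and
    # runs of whitespace correctly, at a small speed cost.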
Pandas: efficiently inserting a large number of rows
I have a large dataframe in this format, call this df:

    index  val1  val2
    0      0.2   0.1
    1      0.5   0.7
    2      0.3   0.4

I have a row I will be inserting, call this myrow:

    index  val1  val2
    -1     0.9   0.9

I wish to insert this row 3 times after every row in the original dataframe, i.e.:

    index  val1  val2
    0
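One vectorized way to avoid per-row inserts (a sketch; the interleaving-key construction is an assumption, not the accepted answer) is to tile myrow, assign sort keys that slot three copies after each original row, and sort once:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"val1": [0.2, 0.5, 0.3], "val2": [0.1, 0.7, 0.4]})
    myrow = pd.DataFrame({"val1": [0.9], "val2": [0.9]})

    n, k = len(df), 3
    inserts = pd.concat([myrow] * (n * k), ignore_index=True)

    df["_key"] = np.arange(n) * (k + 1)          # originals at 0, 4, 8, ...
    inserts["_key"] = (np.repeat(np.arange(n), k) * (k + 1)
                       + np.tile(np.arange(1, k + 1), n))

    out = (pd.concat([df, inserts])
             .sort_values("_key")
             .drop(columns="_key")
             .reset_index(drop=True))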