I ran the following code in RStudio: it reads a huge NASA CSV file, converts it to a dataframe, converts each element to a string, and adds them to a vector. RStudio took 4 minutes and 15 seconds, so I decided to implement the same code in Julia. I ran the following in VS Code: The result was good. The Julia
Tag: performance
Reading files faster in Python
I’m writing a script to read a TXT file where each line is a log entry, and I need to split this log into different files (one each for Hor, Sia, Lmu). I’m reading each line and dividing it into new files with no problem when using my test file (80 kB), but when I try to apply it to the actual file (177 MB
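A minimal sketch of the streaming approach, assuming each line contains its tag somewhere (the real log format, file names, and matching rule are assumptions): reading lazily and keeping the three output files open avoids holding 177 MB in memory.

    # Stream the log once, routing each line to a per-tag file.
    TAGS = ("Hor", "Sia", "Lmu")

    outputs = {tag: open(f"{tag}.txt", "w") for tag in TAGS}
    try:
        with open("big.log") as src:
            for line in src:              # lazy iteration, O(1) memory
                for tag in TAGS:
                    if tag in line:       # assumed matching rule
                        outputs[tag].write(line)
                        break
    finally:
        for f in outputs.values():
            f.close()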
Length of the intersections between a list and a list of lists
Note: almost a duplicate of Numpy vectorization: Find intersection between list and list of lists. Differences: I am focused on efficiency when the lists are large, and I’m searching for the largest intersections. Here are some assumptions: y is a list of ~500,000 sublists of ~500 elements; each sublist in y is a range, so y is characterized by the
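Since each sublist of y is a contiguous range, one way to exploit that (a sketch under that assumption; the array names and sizes are illustrative) is to store only the range endpoints and count, via np.searchsorted, how many elements of a sorted, deduplicated x fall inside each range:

    import numpy as np

    x = np.unique([3, 7, 8, 15, 21])      # sorted, unique query values
    starts = np.array([0, 5, 20])         # y[i] == range(starts[i], stops[i])
    stops  = np.array([10, 9, 30])

    lo = np.searchsorted(x, starts, side="left")
    hi = np.searchsorted(x, stops,  side="left")   # stop is exclusive
    sizes = hi - lo                                # intersection lengths
    best = np.argsort(sizes)[::-1][:5]             # largest intersections first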
How to fix list comprehension ‘bitwise_and’ error and optimize for-loop?
I have the for loop below, but I want to make it into a more computationally efficient variant. I thought I could do that with a list comprehension, but this is giving me the following error: TypeError: ufunc ‘bitwise_and’ not supported for the input types, and the inputs could not be safely coerced to any supported types according to the
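That TypeError typically appears when & is applied to plain Python lists or to unparenthesized comparisons; a generic illustration (not the asker's exact loop) is to convert to NumPy arrays and parenthesize each condition:

    import numpy as np

    a = np.array([1, 2, 3, 4])
    b = np.array([3, 3, 3, 3])

    mask = (a > 1) & (a < b)   # parentheses matter: & binds tighter than >
    result = a[mask]           # vectorized replacement for the for-loop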
How to do hyperparameter optimization on large data?
I have almost finished my time series model and collected enough data, and now I am stuck at hyperparameter optimization. After lots of googling I found a new and good library called ultraopt, but the problem is how large a fragment of my total data (~150 GB) I should use for hyperparameter tuning. And I want to try lots of
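One common pattern, sketched below, is to tune on a small random fragment first and re-check the best configurations on a larger one; the 1% fraction, file name, and chunk size are assumptions, not a recommendation from ultraopt's docs.

    import pandas as pd

    frac = 0.01                    # start small relative to the 150 GB total
    sample = pd.concat(
        chunk.sample(frac=frac, random_state=0)
        for chunk in pd.read_csv("data.csv", chunksize=1_000_000)
    )
    # Run the tuner (ultraopt or any other) against `sample`, then
    # validate the top few configurations on a bigger fragment.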
How to create a script that gives me every possible combination of a six-digit code
A friend and I want to create a script that gives us every possible permutation of a six-digit code, drawn from 36 alphanumeric characters (0-9 and a-z), in alphabetical order, and then be able to see them in a .txt file. And I want it to use all of the CPU and RAM it can, so that it takes less
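A minimal sketch with itertools.product, which already yields the codes in alphabetical order because the alphabet itself is sorted; note that 36**6 is about 2.2 billion codes, so the output file lands near 15 GB, and buffered streaming matters more than saturating every core:

    import itertools
    import string

    ALPHABET = string.digits + string.ascii_lowercase   # "0"-"9" then "a"-"z"

    with open("codes.txt", "w") as out:
        for code in itertools.product(ALPHABET, repeat=6):
            out.write("".join(code) + "\n")             # 36**6 ≈ 2.18e9 lines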
Itertools combinations: how to make it faster?
I am coding a program that takes 54 (num1) numbers and puts them in a list. It then takes 16 (num2) of those numbers and forms a list that contains lists of 16 numbers chosen from all the possible combinations C(num1, num2). It then takes those lists and generates 4×4 arrays. The code I have works, but running 54 numbers
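For scale: C(54, 16) is roughly 2.1 × 10^13, so materializing every combination is hopeless; a sketch (the input numbers and the cap of 10 are illustrative assumptions) generates lazily and reshapes each pick into a 4×4 array:

    import itertools
    import math

    import numpy as np

    nums = range(1, 55)                        # 54 placeholder numbers
    print(math.comb(54, 16))                   # ~2.1e13 combinations

    for i, combo in enumerate(itertools.combinations(nums, 16)):
        grid = np.array(combo).reshape(4, 4)   # one 4x4 array per combination
        if i >= 10:                            # only a lazy slice is feasible
            break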
FAST: 1D overlaps with rows in 2D?
Let’s say I have a 2D array, for example: I want to calculate its overlap with a 1D vector, fast. I can almost do it with (8 ms on a big array): The problem with it is that it only matches if both position and value match. For example, 5 in the 2nd column of the 1D vector did not match the 5 in the 3rd column of the 2nd
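The distinction is positional match versus value-anywhere match; a sketch with illustrative data shows both, with np.isin giving the position-independent count in one vectorized call:

    import numpy as np

    arr = np.array([[1, 5, 9],
                    [4, 5, 6],
                    [7, 2, 5]])
    vec = np.array([5, 5, 3])

    positional = (arr == vec).sum(axis=1)       # matches only in same column
    anywhere = np.isin(arr, vec).sum(axis=1)    # matches regardless of position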
Performance tuning: string wordcount in df
I have a df with a column “free text”. I wish to count how many characters and words each cell has. Currently, I do it like this: The problem is that it is pretty slow. I thought about using np.where but I wasn’t sure how. Would appreciate your help here. Answer: IIUC, you can try via str.len() and str.count(). Sample dataframe used:
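A sketch of that str.len() / str.count() approach on an illustrative stand-in frame (the real column contents are unknown):

    import pandas as pd

    df = pd.DataFrame({"free text": ["hello world", "one two three"]})

    df["char_count"] = df["free text"].str.len()
    df["word_count"] = df["free text"].str.count(" ") + 1  # crude: assumes single spaces
    # df["free text"].str.split().str.len() handles empty cells and
    # runs of whitespace correctly, at a small speed cost.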
Pandas: efficiently inserting a large number of rows
I have a large dataframe in this format, call this df:

    index  val1  val2
    0      0.2   0.1
    1      0.5   0.7
    2      0.3   0.4

I have a row I will be inserting, call this myrow:

    index  val1  val2
    -1     0.9   0.9

I wish to insert this row 3 times after every row in the original dataframe, i.e.:

    index  val1  val2
    0
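One vectorized way to avoid per-row inserts (a sketch; the interleaving-key construction is an assumption, not the accepted answer) is to tile myrow, assign sort keys that slot three copies after each original row, and sort once:

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({"val1": [0.2, 0.5, 0.3], "val2": [0.1, 0.7, 0.4]})
    myrow = pd.DataFrame({"val1": [0.9], "val2": [0.9]})

    n, k = len(df), 3
    inserts = pd.concat([myrow] * (n * k), ignore_index=True)

    df["_key"] = np.arange(n) * (k + 1)          # originals at 0, 4, 8, ...
    inserts["_key"] = (np.repeat(np.arange(n), k) * (k + 1)
                       + np.tile(np.arange(1, k + 1), n))

    out = (pd.concat([df, inserts])
             .sort_values("_key")
             .drop(columns="_key")
             .reset_index(drop=True))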