I have a list of dates and I want to get the difference in days between each one and a defined date, then append the calculated days in a new column. I get TypeError: unsupported operand type(s) for -: ‘DatetimeArray’ and ‘datetime.date’. Now, how can I read the dates in the CSV file in the same format as the defined date? Is there a way
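A minimal sketch of one way around this kind of TypeError, assuming the column has been parsed to datetimes and the defined date is a plain `datetime.date`. The column names and sample values are hypothetical and stand in for the CSV described in the question.

```python
import datetime as dt

import pandas as pd

# Hypothetical data standing in for the date column of the CSV file.
df = pd.DataFrame({"date": ["2021-01-01", "2021-01-15", "2021-02-01"]})

# Parse the strings to datetime64; with a real file,
# pd.read_csv(path, parse_dates=["date"]) does the same at load time.
df["date"] = pd.to_datetime(df["date"])

# The "defined" date, assumed here to be a datetime.date.
reference = dt.date(2021, 1, 10)

# Wrapping the date in pd.Timestamp makes both operands datetime-like,
# so the subtraction yields timedeltas and .dt.days extracts whole days.
df["days_diff"] = (df["date"] - pd.Timestamp(reference)).dt.days
```

Negative values mean the row's date falls before the reference date.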
Tag: dataset
Problem converting data from CoNLL format to spaCy format
How can I convert data from CoNLL format to spaCy format? I ran my code following a similar Q&A on Stack Overflow: How to convert from CoNLL format to spacy format. However, I cannot fix the error. I’ve read the documentation for spacy convert, but I have no idea how to fix the error. Environment: Python 3.9.1, spaCy version
I’m trying to import a CSV file using pandas, but I’m getting an error (see picture)
What am I doing wrong? I’m trying to import a CSV file using pandas, and I get either an error stating that the file can’t be found or a UnicodeError message. Answer: You should escape your backslashes on Windows. \U is interpreted as a Unicode escape directive in the string. Try:
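The answer's own code is not preserved in this excerpt. As a hedged illustration of the advice, using a hypothetical path, these are three ways to write a Windows path so that `\U` is not read as an escape sequence:

```python
# Hypothetical Windows path used only for illustration.
# In a normal string literal, "\U" starts a Unicode escape, which is why
# a path such as "C:\Users\..." fails before pandas ever sees it.

raw_path = r"C:\Users\me\data.csv"        # raw string: backslashes kept literally
escaped_path = "C:\\Users\\me\\data.csv"  # each backslash escaped explicitly
forward_path = "C:/Users/me/data.csv"     # forward slashes also work on Windows

# Any of these can then be passed on, e.g. pd.read_csv(raw_path).
```

The raw-string and escaped forms produce the identical string; the forward-slash form is a different string that refers to the same file.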
Error with using length function; output will not be anything other than one
I have multiple CSV files that I’ve uploaded to both Google Colab and Jupyter Notebook. I can successfully print certain lines of my file. The file contains rows of strings. When I open the file, it opens in the Numbers application on my MacBook. Anyway, for some reason, whenever I try to print the length of ANY line in my file, Python
extracting images and their label one by one from ImageDataGenerator().flow_from_directory
I imported my dataset (38 classes) for validation using ImageDataGenerator().flow_from_directory, and I want to pick each image and its label one by one. For example, I want to pick the first image and its label. When I tried this, I got the image, but for the label I just got an array of shape (32, 38) containing 0s and 1s. Is there
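The (32, 38) shape suggests one batch of 32 one-hot labels across 38 classes, since `flow_from_directory` defaults to `batch_size=32` and `class_mode='categorical'`. A minimal sketch of pulling a single image and label out of such a batch follows; NumPy arrays stand in for the generator's output, and the image size is an assumption.

```python
import numpy as np

# Stand-in for one batch, as returned by `images, labels = next(generator)`.
# The 64x64x3 image size is hypothetical; the label shape matches the question.
rng = np.random.default_rng(0)
images = rng.random((32, 64, 64, 3))
class_ids = rng.integers(0, 38, size=32)
labels = np.eye(38)[class_ids]            # one-hot labels, shape (32, 38)

# Index into the batch to get one sample, then decode the one-hot row.
first_image = images[0]                   # shape (64, 64, 3)
first_label = int(np.argmax(labels[0]))   # integer class index in [0, 38)
```

With a real generator, setting `batch_size=1` would yield one image and one label row per step instead of indexing into a batch.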
Loading a large dataset from CSV files in TensorFlow
I use the following code to load a bunch of images in my dataset in TensorFlow, which works well: I am wondering how I can use similar code to load a bunch of CSV files. Each CSV file has a shape of 256 x 256 and can be treated as a grayscale image. I don’t know what I should
How to split parallel corpora while keeping alignment?
I have two text files containing parallel text in two languages (potentially millions of lines). I am trying to generate random train/validate/test files from that single file, as train_test_split does in sklearn. However, when I try to import it into pandas using read_csv, I get errors on many of the lines because of erroneous data in there and it would
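One way to keep the two sides aligned without pandas is to shuffle a single list of line indices and apply it to both files. The sketch below assumes line i of one file corresponds to line i of the other; the function name, split ratios, and sample data are hypothetical.

```python
import random


def split_parallel(src_lines, tgt_lines, ratios=(0.8, 0.1, 0.1), seed=0):
    """Split two aligned lists into train/valid/test using shared indices."""
    if len(src_lines) != len(tgt_lines):
        raise ValueError("parallel files must have the same number of lines")

    # Shuffle indices once so both languages receive the same permutation.
    indices = list(range(len(src_lines)))
    random.Random(seed).shuffle(indices)

    n_train = int(len(indices) * ratios[0])
    n_valid = int(len(indices) * ratios[1])
    parts = {
        "train": indices[:n_train],
        "valid": indices[n_train:n_train + n_valid],
        "test": indices[n_train + n_valid:],
    }
    return {
        name: ([src_lines[i] for i in idx], [tgt_lines[i] for i in idx])
        for name, idx in parts.items()
    }


# Tiny in-memory example; real data would come from reading both files
# line by line with open(...).
src = [f"src {i}" for i in range(10)]
tgt = [f"tgt {i}" for i in range(10)]
splits = split_parallel(src, tgt)
```

Because only indices are shuffled, malformed lines never need to be parsed as CSV, which sidesteps the read_csv errors mentioned above.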
Getting min and max datetime for each date in CSV
I’m kind of new to data science and Python. First of all, do you suggest using any library other than pandas when dealing with a huge dataset (100K+ rows)? Second of all, let me explain my current problem. I have a dataset in which I have a Datetime column. To make it easy to understand, let’s say I only
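A minimal sketch of the per-date minimum and maximum named in the title, assuming a column called `Datetime` as in the question. The sample timestamps are invented for illustration.

```python
import pandas as pd

# Hypothetical timestamps standing in for the Datetime column of the CSV.
df = pd.DataFrame({
    "Datetime": pd.to_datetime([
        "2021-03-01 08:15:00",
        "2021-03-01 17:40:00",
        "2021-03-02 09:05:00",
        "2021-03-02 12:30:00",
        "2021-03-02 23:59:00",
    ])
})

# Group rows by calendar date, then take the earliest and latest timestamp
# within each group. The result has one row per date with "min" and "max".
per_day = df.groupby(df["Datetime"].dt.date)["Datetime"].agg(["min", "max"])
```

A groupby like this is a vectorized operation and is generally comfortable at the 100K-row scale mentioned in the question.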
Data Augmentation in PyTorch
I am a little bit confused about the data augmentation performed in PyTorch. As far as I know, when we perform data augmentation, we KEEP our original dataset and then add other versions of it (flipping, cropping, etc.). But that doesn’t seem to be what happens in PyTorch. As far as I understood from the references, when we use data.transforms