
How to check whether the data in two files match, column by column?

I have two files, for example input.txt and output.txt.

input.txt

000000008b8c5200 bcbiubcbueqihd167
0000000056fb2620 akbhjgbfre34168gf
0000000045ab4630 jbshjdb7usvyysdu8
00000000672fa001 jhwdggdiuqwhuhius

output.txt

00000000af16380f avhdhjwqdjdkjdnxk
0000000056fb2620 akbhjgbfre34168gf
0000000045ab4630 jbshjdb7usvyysdu8
000000008b8c5200 bdqjhwdjhfjjehfiu

I have to compare these two files as follows: if the first column matches between the two files, I then have to check the second column. If both columns match, that is fine; if the second column does not match, I have to write that data to a file.

That is:

If 00000000af16380f is not present in output.txt, then I have to write that row to a file named unmatch.txt.

unmatch.txt (if an address in input.txt does not match any address in output.txt):

00000000af16380f avhdhjwqdjdkjdnxk

And if the address matches, for example input.txt and output.txt both have the same address, then I need to check whether their data match. If the data do not match, I need to write it to another file, un_data.txt.

un_data.txt

000000008b8c5200 bcbiubcbueqihd167
000000008b8c5200 bdqjhwdjhfjjehfiu

How can I do this?

This is my current code:

file_1 = None
file_2 = None
# Read both files into lists of stripped lines
with open("input.txt") as f_1:
    file_1 = [line.strip() for line in f_1.readlines()]
with open("output.txt") as f_2:
    file_2 = [line.strip() for line in f_2.readlines()]
# Collect whole lines of input.txt that do not appear in output.txt
file_3 = []
for line in file_1:
    if line not in file_2:
        file_3.append(line)
file_3 = '\n'.join(file_3)
with open("data.txt", "w") as f_3:
    f_3.write(file_3)


Answer

Based on the information provided, here is what I came up with.
The code is commented for further explanation:

# ------------------------------------------------------
# IMPORT the text files
# ------------------------------------------------------
file_1 = None
file_2 = None
with open("input.txt") as f_1:
     file_1 = [line.strip() for line in f_1.readlines()]
with open("output.txt") as f_2:
     file_2 = [line.strip() for line in f_2.readlines()]

#-------------------------------------------------------
# Function for storing list into a file
# ------------------------------------------------------
def list_to_file(my_list, filename):
    
    # Convert list to string
    my_list = '\n'.join(my_list)
    
    # Write string to file
    with open(filename, "w") as f:
        f.write(my_list)

# ----------------------------------------------------------------
# Hold the columns of output.txt into their own respective lists
# ----------------------------------------------------------------
f2_addresses = []
f2_data = []
for line_2 in file_2:
    address_2, data_2 = line_2.split(' ')
    f2_addresses.append(address_2)
    f2_data.append(data_2)


# --------------------------------------------------------------
# This will hold the lines for the 
#   - un_data.txt file  
#   - unmatch.txt file
# -------------------------------------------------------------
un_data_lines = []
unmatch_lines = []


# ---------------------------------------------------------------
# Iterate through input.txt and compare against output.txt
# ---------------------------------------------------------------
for line in file_1:
    address_1, data_1 = line.split(' ')
    
    # Check if address_1 is in output.txt 
    if address_1 in f2_addresses:
        
        # Get the index of the address within the f2_addresses list
        address_index = f2_addresses.index(address_1)
        
        # Compare the data columns - if do not match - append lines to un_data_lines
        if data_1 != f2_data[address_index]:
            un_data_lines.append(f'{address_1} {data_1}')
            un_data_lines.append(f'{f2_addresses[address_index]} {f2_data[address_index]}')
        
    else:
        unmatch_lines.append(f'{address_1} {data_1}')
        
        
# ------------------------------------------       
# Store the lists into files
# ------------------------------------------
list_to_file(un_data_lines, 'un_data.txt')
list_to_file(unmatch_lines, 'unmatch.txt')


OUTPUT:

unmatch.txt

00000000672fa001 jhwdggdiuqwhuhius


un_data.txt

000000008b8c5200 bcbiubcbueqihd167
000000008b8c5200 bdqjhwdjhfjjehfiu
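
As a possible variation on the approach above, the two parallel lists can be replaced by a dictionary keyed on address, which avoids a linear search for every row. This is only a sketch under a few assumptions: the function name compare_rows is hypothetical, each row is taken to hold exactly two whitespace-separated fields, blank or malformed rows are skipped, and when an address repeats in output.txt only its first occurrence is used (the same behaviour as list.index in the answer).

```python
def compare_rows(input_lines, output_lines):
    """Return (unmatch_lines, un_data_lines) for two lists of 'address data' rows."""
    # Map each address of the second file to its data for fast lookup.
    # setdefault keeps the first occurrence if an address is repeated.
    output_by_address = {}
    for line in output_lines:
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed rows (assumption)
            output_by_address.setdefault(parts[0], parts[1])

    unmatch_lines = []  # rows whose address is missing from the second file
    un_data_lines = []  # row pairs whose address matches but data differs
    for line in input_lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        address, data = parts
        if address not in output_by_address:
            unmatch_lines.append(f"{address} {data}")
        elif output_by_address[address] != data:
            un_data_lines.append(f"{address} {data}")
            un_data_lines.append(f"{address} {output_by_address[address]}")
    return unmatch_lines, un_data_lines
```

The two returned lists can be passed to the list_to_file function from the answer, with the file contents read into file_1 and file_2 as before: compare_rows(file_1, file_2).
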
User contributions licensed under: CC BY-SA