Skip to content
Advertisement

Python: How to encode DNA sequence using binary values?

I would like to convert a file that contained few DNA sequences into binary values which is as follow:

JavaScript

FileA.txt

JavaScript

Desired output

JavaScript

I have tried using this code to solve my problem but the bin output file seem failed to output my desired answer. Can anyone help me?

Code

JavaScript

Advertisement

Answer

Do you want ascii output or binary? The below will give you what you show in your post (though on a single line. Code needs to be modified to keep newlines).

JavaScript

EDIT This creates a binary file where each nucleotide is a byte and the newlines are preserved in binary format.

JavaScript

Edit 2 (moving my comment to the answer body): If this isn’t for an assignment or to interface with someone’s software, I recommend encoding your nucleotides as 0b00, 0b01,0b10, and 0b11 to save time and space. You can still use the 4-bit 0b1010 newline character to separate nucleotide sequences.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement