Python and Matplotlib: characters as the x axis

Hi Stack Overflow community. I think I am trying to code the impossible with matplotlib, so if there is a different Python library that would suit me better, please let me know!

I have the entire amino acid sequence of a protein (protein x), represented as capital letters in the image. This will be my x axis.

I have two Excel columns: Disease and Control. Each column contains fragments of protein x's full amino acid sequence. Sometimes there are multiple hits, where the disease or control column contains the same fragment of protein x more than once. I want these to stack on top of each other so that one can see how many hits the disease and control have on protein x.

Confusing? Sorry, here's a sample of what I was able to come up with using PowerPoint.

[Image: Amino Acid Comparison]

The black text is the reference sequence. Purple is control. Pink is disease. Make sense now?

I need to do this with a HUGE dataset, so no, I do not want to “just use powerpoint for hours”. I also want to do it with any reference sequence of my choosing.

I’m not asking someone to do my job for me. I need someone to point me in the right direction. Is there a special library? Should I be converting everything into numbers and then relabeling as text?

Thanks and I appreciate any advice.

Answer

Use a script to generate an SVG image, which is just XML text. I will work through a simpler example.

Suppose your target is this: [Image: overall image]

Begin by breaking the reference string at each position where a column of stacked fragments will start, in this case at ‘EF’ and ‘IJKL’. You can position the pieces of the reference string using the x and y attributes of the SVG text elements. Since you know where each fragment begins and how tall the characters are, you can position the layers in each column.
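The splitting and stacking step can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: the function names `split_reference` and `stack_levels` are made up for this example, each fragment is located with `str.find` (so only its first occurrence in the reference is used), and fragments are stacked purely by their start position.

```python
from collections import defaultdict


def split_reference(reference, fragments):
    """Cut the reference wherever a fragment begins.

    Returns (start_index, piece) pairs that together cover the whole reference.
    """
    starts = {reference.find(f) for f in fragments if f in reference}
    cuts = sorted({0, len(reference), *starts})
    return [(a, reference[a:b]) for a, b in zip(cuts, cuts[1:])]


def stack_levels(reference, fragments):
    """Give each fragment a layer number within its column.

    Level 1 sits directly above the reference row; a repeated hit at the
    same start position goes one level higher.
    """
    next_level = defaultdict(int)
    placed = []
    for frag in fragments:
        start = reference.find(frag)  # assumption: first occurrence only
        if start < 0:
            continue  # fragment not found in the reference, skip it
        next_level[start] += 1
        placed.append((frag, start, next_level[start]))
    return placed


reference = "ABCDEFGHIJKLMN"
print(split_reference(reference, ["EF", "IJKL"]))
# [(0, 'ABCD'), (4, 'EFGH'), (8, 'IJKLMN')]
print(stack_levels(reference, ["EF", "IJKL", "EF"]))
# [('EF', 4, 1), ('IJKL', 8, 1), ('EF', 4, 2)]
```

Note that this simple scheme only stacks fragments that start at the same position. Fragments that overlap but start at different positions would need a more careful layer assignment.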

This is the kind of file you would have to build.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->

<svg
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:cc="http://creativecommons.org/ns#"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:svg="http://www.w3.org/2000/svg"
   xmlns="http://www.w3.org/2000/svg"
   xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
   width="210mm"
   height="297mm"
   viewBox="0 0 210 297"
   version="1.1"
   id="svg8"
   inkscape:version="0.92.0 r15299"
   sodipodi:docname="genes.svg">
  <defs
     id="defs2" />
  <sodipodi:namedview
     id="base"
     pagecolor="#ffffff"
     bordercolor="#666666"
     borderopacity="1.0"
     inkscape:pageopacity="0.0"
     inkscape:pageshadow="2"
     inkscape:zoom="1.4"
     inkscape:cx="170.60599"
     inkscape:cy="341.08014"
     inkscape:document-units="mm"
     inkscape:current-layer="layer1"
     showgrid="false"
     inkscape:window-width="1095"
     inkscape:window-height="676"
     inkscape:window-x="145"
     inkscape:window-y="122"
     inkscape:window-maximized="0" />
  <metadata
     id="metadata5">
    <rdf:RDF>
      <cc:Work
         rdf:about="">
        <dc:format>image/svg+xml</dc:format>
        <dc:type
           rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
        <dc:title></dc:title>
      </cc:Work>
    </rdf:RDF>
  </metadata>
  <g
     inkscape:label="Layer 1"
     inkscape:groupmode="layer"
     id="layer1">
    <text
       xml:space="preserve"
       style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:9.8777771px;line-height:6.61458302px;font-family:Courier;-inkscape-font-specification:Courier;font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332;"
       x="24.588797"
       y="179.4014"
       id="text12"><tspan
         sodipodi:role="line"
         id="tspan10"
         x="24.588797"
         y="185.32886"
         style="stroke-width:0.26458332;-inkscape-font-specification:Courier;font-family:Courier;font-weight:normal;font-style:normal;font-stretch:normal;font-variant:normal;" /></text>
    <text
       xml:space="preserve"
       style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:9.8777771px;line-height:6.61458302px;font-family:Calibri;-inkscape-font-specification:'Calibri, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332"
       x="23.8125"
       y="207.41963"
       id="text24"><tspan
         sodipodi:role="line"
         id="tspan22"
         x="23.8125"
         y="207.41963"
         style="stroke-width:0.26458332">ABCD</tspan></text>
    <text
       xml:space="preserve"
       style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:9.8777771px;line-height:6.61458302px;font-family:Calibri;-inkscape-font-specification:'Calibri, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332"
       x="46.302082"
       y="207.41965"
       id="text28"><tspan
         sodipodi:role="line"
         id="tspan26"
         x="46.302082"
         y="207.41963"
         style="stroke-width:0.26458332">EFGH</tspan></text>
    <text
       xml:space="preserve"
       style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:9.8777771px;line-height:6.61458302px;font-family:Calibri;-inkscape-font-specification:'Calibri, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332"
       x="67.657738"
       y="207.41963"
       id="text32"><tspan
         sodipodi:role="line"
         id="tspan30"
         x="67.657738"
         y="207.41963"
         style="stroke-width:0.26458332">IJKLMN</tspan></text>
    <text
       xml:space="preserve"
       style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:9.8777771px;line-height:6.61458302px;font-family:Calibri;-inkscape-font-specification:'Calibri, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332"
       x="46.680061"
       y="199.67113"
       id="text36"><tspan
         sodipodi:role="line"
         id="tspan34"
         x="46.302082"
         y="199.67113"
         style="stroke-width:0.26458332">EF</tspan></text>
    <text
       xml:space="preserve"
       style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-size:9.8777771px;line-height:6.61458302px;font-family:Calibri;-inkscape-font-specification:'Calibri, Normal';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;letter-spacing:0px;word-spacing:0px;writing-mode:lr-tb;text-anchor:start;fill:#000000;fill-opacity:1;stroke:none;stroke-width:0.26458332"
       x="67.846725"
       y="192.86755"
       id="text40"><tspan
         sodipodi:role="line"
         id="tspan38"
         x="67.657738"
         y="192.86755"
         style="stroke-width:0.26458332">IJKL</tspan></text>
  </g>
</svg>

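Obviously I’ve done it in Inkscape, but you’ll get the idea. There’s nothing here that can’t be done in Python. A script only needs to emit the svg root element and the text elements; the Inkscape-specific metadata is optional.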
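As a rough sketch of what such a script might look like, the following builds a much smaller SVG using only the standard library. The names `text_element` and `render_svg`, the cell and row sizes, and the colours are all assumptions chosen for illustration. Instead of relying on the font's character width, it gives every character its own x coordinate (the SVG `x` attribute on a text element accepts a list of values), so the stacked fragments line up with the reference letters.

```python
from xml.sax.saxutils import escape


def text_element(s, start, y, fill, cell=6.0, left=10.0):
    """One <text> element with an explicit x coordinate per character."""
    xs = " ".join(f"{left + (start + i) * cell:g}" for i in range(len(s)))
    return f'<text x="{xs}" y="{y:g}" fill="{fill}">{escape(s)}</text>'


def render_svg(reference, hits, cell=6.0, row=8.0, left=10.0):
    """Render the reference with hits stacked above it.

    hits is a list of (fragment, colour) pairs. Each fragment is located
    with str.find (first occurrence only, an assumption of this sketch)
    and stacked by start position.
    """
    levels = {}
    placed = []
    for frag, colour in hits:
        start = reference.find(frag)
        if start < 0:
            continue  # fragment not present in the reference
        levels[start] = levels.get(start, 0) + 1
        placed.append((frag, start, levels[start], colour))

    top = max((lvl for _, _, lvl, _ in placed), default=0)
    base_y = row * (top + 1)  # baseline of the reference row
    width = 2 * left + len(reference) * cell
    height = base_y + row

    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width:g}" '
        f'height="{height:g}" font-family="monospace" font-size="{row:g}">'
    ]
    parts.append(text_element(reference, 0, base_y, "black", cell, left))
    for frag, start, lvl, colour in placed:
        parts.append(text_element(frag, start, base_y - lvl * row, colour, cell, left))
    parts.append("</svg>")
    return "\n".join(parts)


if __name__ == "__main__":
    svg = render_svg(
        "ABCDEFGHIJKLMN",
        [("EF", "purple"), ("IJKL", "deeppink"), ("EF", "purple")],
    )
    with open("genes_sketch.svg", "w", encoding="utf-8") as fh:
        fh.write(svg)
```

Opening the resulting file in a browser or in Inkscape should show the reference row with the coloured fragments stacked above their starting positions. The hit lists could come from the Disease and Control columns, with one colour per column.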

User contributions licensed under: CC BY-SA