How to change duplicate keys that belong to different articles in BibTeX?

With the help of a plugin, I get a .bib file with information about scientific articles. Sometimes the same key appears in different records.

For example:

@inproceedings{Hosseini_2016,
    doi = {10.1109/ism.2016.0028},
    url = {https://doi.org/10.1109%2Fism.2016.0028},
    year = 2016,
    month = {dec},
    publisher = {{IEEE}},
    author = {Mohammad Hosseini and Viswanathan Swaminathan},
    title = {Adaptive 360 {VR} Video Streaming: Divide and Conquer},
    booktitle = {2016 {IEEE} International Symposium on Multimedia ({ISM})}
}
@inproceedings{Hosseini_2016,
    doi = {10.1109/ism.2016.0093},
    url = {https://doi.org/10.1109%2Fism.2016.0093},
    year = 2016,
    month = {dec},
    publisher = {{IEEE}},
    author = {Mohammad Hosseini and Viswanathan Swaminathan},
    title = {Adaptive 360 {VR} Video Streaming Based on {MPEG}-{DASH} {SRD}},
    booktitle = {2016 {IEEE} International Symposium on Multimedia ({ISM})}
}

I am using the pybtex library to parse the file. This library ignores duplicate entries with the same key. Before using it, I need to process the file so that all the keys in it are different. How can I do that?

Answer

I decided to use regular expressions, although there is probably a more convenient solution. I simply replace every key in the file with a generated nanoid.

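import re

from nanoid import generate

def process_bibtex(fn):
    with open(fn, encoding="utf-8") as r_file:
        bibtex = r_file.read()
    # Match the start of an entry, "@type{key,": group 1 is the entry type, group 2 the old key.
    pattern = r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,"
    def callback(matchobj):
        # Keep the entry type and swap the key for a freshly generated nanoid.
        return f"@{matchobj.group(1)}{{{generate()},"
    with open(fn, "w", encoding="utf-8") as w_file:
        w_file.write(re.sub(pattern, callback, bibtex))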
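
If the original keys should stay readable, a possible variation is to rename only the repeated keys by appending a numeric suffix. The following is a minimal sketch under that assumption, not part of the solution above: the names `ENTRY_START` and `dedupe_keys` and the `_2`, `_3` suffix scheme are made up for illustration, and it uses only the standard library.

```python
import re
from collections import Counter

# Matches the start of an entry, e.g. "@inproceedings{Hosseini_2016,".
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^,\s{}]+)\s*,")

def dedupe_keys(bibtex: str) -> str:
    """Return the BibTeX text with repeated keys made unique (hypothetical helper)."""
    # Every key already present, so a new suffixed key never collides with one.
    taken = {m.group(2) for m in ENTRY_START.finditer(bibtex)}
    seen = Counter()

    def rename(match):
        entry_type, key = match.group(1), match.group(2)
        seen[key] += 1
        if seen[key] == 1:
            return match.group(0)  # first occurrence keeps its key unchanged
        n = seen[key]
        new_key = f"{key}_{n}"
        while new_key in taken:  # skip suffixes that are already in use
            n += 1
            new_key = f"{key}_{n}"
        taken.add(new_key)
        return f"@{entry_type}{{{new_key},"

    return ENTRY_START.sub(rename, bibtex)
```

Applied to the example from the question, the first record would keep the key Hosseini_2016 and the second would become Hosseini_2016_2. The result can be written back to the file in the same way as in process_bibtex.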
