I use a plugin to generate a .bib file with metadata for scientific articles. Sometimes the same key ends up on more than one record.
For example:
@inproceedings{Hosseini_2016,
    doi = {10.1109/ism.2016.0028},
    url = {https://doi.org/10.1109%2Fism.2016.0028},
    year = 2016,
    month = {dec},
    publisher = {{IEEE}},
    author = {Mohammad Hosseini and Viswanathan Swaminathan},
    title = {Adaptive 360 {VR} Video Streaming: Divide and Conquer},
    booktitle = {2016 {IEEE} International Symposium on Multimedia ({ISM})}
}

@inproceedings{Hosseini_2016,
    doi = {10.1109/ism.2016.0093},
    url = {https://doi.org/10.1109%2Fism.2016.0093},
    year = 2016,
    month = {dec},
    publisher = {{IEEE}},
    author = {Mohammad Hosseini and Viswanathan Swaminathan},
    title = {Adaptive 360 {VR} Video Streaming Based on {MPEG}-{DASH} {SRD}},
    booktitle = {2016 {IEEE} International Symposium on Multimedia ({ISM})}
}
I am using the pybtex library to parse the file. This library ignores duplicate entries that share a key, so before parsing I need to preprocess the file so that every key in it is unique. How can I do that?
Answer
I decided to use regular expressions. There is probably a more convenient solution. I simply replace every key with a random ID generated by nanoid.
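import re

from nanoid import generate


def process_bibtex(fn):
    # Read the whole .bib file into memory.
    with open(fn, encoding="utf-8") as r_file:
        bibtex = r_file.read()

    # Match the start of an entry, "@type{key,".
    # Group 1 is the entry type, group 2 is the old key.
    pattern = r"@(\w+)\s*\{\s*([^\s,{}]+)\s*,"

    def callback(matchobj):
        # Keep the entry type and swap the key for a fresh random ID.
        return f"@{matchobj.group(1)}{{{generate()},"

    # Overwrite the file in place with the rewritten keys.
    with open(fn, "w", encoding="utf-8") as w_file:
        w_file.write(re.sub(pattern, callback, bibtex))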
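If you would rather keep the keys readable, a variation on the same idea is to leave the first occurrence of each key alone and only add a numeric suffix to the repeats. This is just a sketch, not something pybtex provides: the names `ENTRY_START` and `dedupe_keys` are made up for illustration, it works on the file contents as a string, and it assumes keys contain no whitespace, commas, or braces.

```python
import re
from collections import Counter

# Matches "@type{key," at the start of an entry.
# Assumption: keys contain no whitespace, commas, or braces.
ENTRY_START = re.compile(r"@(\w+)\s*\{\s*([^\s,{}]+)\s*,")


def dedupe_keys(bibtex: str) -> str:
    """Return the BibTeX text with repeated keys renamed key_2, key_3, ..."""
    # Every key already present in the file, so a new name never collides.
    taken = {m.group(2) for m in ENTRY_START.finditer(bibtex)}
    seen = Counter()

    def callback(m):
        entry_type, key = m.group(1), m.group(2)
        seen[key] += 1
        if seen[key] == 1:
            return m.group(0)  # first occurrence stays untouched
        n = seen[key]
        new_key = f"{key}_{n}"
        # Skip suffixes that are already used somewhere in the file.
        while new_key in taken:
            n += 1
            new_key = f"{key}_{n}"
        taken.add(new_key)
        return f"@{entry_type}{{{new_key},"

    return ENTRY_START.sub(callback, bibtex)
```

To use it, read the file into a string, pass it through `dedupe_keys`, and write the result back (or hand it straight to the parser). Unlike the nanoid version, any citations that already point at the first entry keep working, since only the later duplicates get renamed.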