Obfuscate file name and folder path

I am working on a Git repo and need to share its folder hierarchy and file names with an external vendor so they can perform some code analysis. I have the whole hierarchy available in a CSV file.

The problem is that I cannot provide the actual folder paths or file names, as they contain protected information. For the code analysis, the external vendor only needs folder paths and file names; they can use that information to produce their analysis output. Internally, we need to keep a mapping between the actual and the obfuscated file paths and names.

An example of this mapping would be:

conf1/conf2/conf3.txt -> dsdasd/dsadsd/dadssd.txt
conf1/conf2/conf4.py -> dsdasd/dsadsd/dasdsd.py

Manual mapping is not feasible, as the repo contains over 200k files in a 20-level-deep folder hierarchy. There are two requirements for this conversion:

  1. The extension should be retained
  2. The same folder path should always get the same obfuscated remapping

Answer

I’ll describe how I’d go about this in pseudocode.

NEXT := 1
MAP := empty
for each full path P in your repo
  split P using '/' as the delimiter
  for each element E of the split path
    if it is the last element, remove the extension
    if E is in the MAP
      CODE := MAP[E]
    else
      CODE := NEXT
      increase NEXT
      MAP[E] := CODE
    replace E with CODE
    if it is the last element, put back the extension
  join the transformed elements using '/' as the delimiter
  print the result
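
As a minimal Python sketch of the pseudocode above: the function name `obfuscate_paths` is made up for illustration, and the sketch assumes that paths use '/' as the separator and that only the final extension (as split by `posixpath.splitext`) has to be kept. Note that, like the pseudocode, it assigns codes per path element, so a folder name that appears in two different places gets the same code in both.

```python
import posixpath
from itertools import count


def obfuscate_paths(paths):
    """Replace every path element with a sequential number, keeping the file extension."""
    codes = {}          # original element -> assigned code (MAP in the pseudocode)
    counter = count(1)  # plays the role of NEXT in the pseudocode
    result = {}         # original path -> obfuscated path

    for path in paths:
        parts = path.split("/")
        new_parts = []
        for i, part in enumerate(parts):
            ext = ""
            if i == len(parts) - 1:
                # Only the last element is a file name; split off its extension.
                part, ext = posixpath.splitext(part)
            if part not in codes:
                codes[part] = str(next(counter))
            # Put the extension back (it is empty for folders).
            new_parts.append(codes[part] + ext)
        result[path] = "/".join(new_parts)
    return result


mapping = obfuscate_paths(["conf1/conf2/conf3.txt", "conf1/conf2/conf4.py"])
for original, obfuscated in mapping.items():
    print(f"{original} -> {obfuscated}")
```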

This will convert:

conf1/conf2/conf3.txt -> 1/2/3.txt
conf1/conf2/conf4.py -> 1/2/4.py

and meets your requirements: the extension is kept, and a given folder name always maps to the same code. If you need the result to look truly obfuscated rather than numbered, use a unique random word in place of NEXT in the pseudocode above.
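
A possible sketch of the random-word variant in Python, under some assumptions: the helper names `random_code` and `obfuscate_paths_random` are hypothetical, the "random word" is an 8-character hex token from `secrets.token_hex`, and the internal mapping is written as CSV to an in-memory buffer (a real file would work the same way). Uniqueness is enforced by retrying on the rare collision.

```python
import csv
import io
import posixpath
import secrets


def random_code(used, nbytes=4):
    """Return a hex token that has not been handed out before."""
    while True:
        token = secrets.token_hex(nbytes)
        if token not in used:
            used.add(token)
            return token


def obfuscate_paths_random(paths):
    """Map each path element to a unique random token, keeping the file extension."""
    codes, used, result = {}, set(), {}
    for path in paths:
        *dirs, filename = path.split("/")
        stem, ext = posixpath.splitext(filename)
        new_parts = []
        for part in dirs + [stem]:
            if part not in codes:
                codes[part] = random_code(used)
            new_parts.append(codes[part])
        result[path] = "/".join(new_parts) + ext
    return result


mapping = obfuscate_paths_random(["conf1/conf2/conf3.txt", "conf1/conf2/conf4.py"])

# Keep the original -> obfuscated mapping internally, here as CSV text.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["original", "obfuscated"])
writer.writerows(mapping.items())
print(buffer.getvalue())
```

Only the obfuscated column would go to the vendor; the full mapping stays internal so that their results can be translated back to the real paths.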

User contributions licensed under: CC BY-SA