I have a directory structure that looks like the image.

This structure is organised in a form of objects. File objects are each in the form: {type:'file', name:'file a1'} for example. Folderobjects are like File objects, but they have an extra key children which is an array of sub folders/files.
Here’s a console log of the rootFolder object from the image before:
The goal:
What I want is to write a search function that returns the path for a specific file in the form of an array. For example, if I were to search for file c3, the result would be: ['folder a3', 'folder b4'].
I have tried many things, but recursion isn’t my strongest suit.
The code could be in javascript or python; what I care about is the algorithm itself.
Here is the code used to generate rootFolder:
// defining constants
const TYPES = {FILE: 'file', FOLDER: 'folder'}
const FILE_NAMES = {
FILE_A1:'file a1',
FILE_A2:'file a2',
FILE_A4:'file a4',
FILE_B2:'file b2',
FILE_B3:'file b3',
FILE_C1:'file c1',
FILE_C2:'file c2',
FILE_C3:'file c3',
FILE_C4:'file c4',
FILE_C5:'file c5',
}
const FOLDER_NAMES = {
ROOT_FOLDER:'root folder',
FOLDER_A3:'folder a3',
FOLDER_B1:'folder b1',
FOLDER_B4:'folder b4'
}
// defining classes
class File {
constructor(type, name) {
this.name = name
this.type = type
}
}
class Folder {
constructor (type, name, children) {
this.name = name
this.type = type
this.children = children
}
}
// defining file objects
const fileA1 = new File(TYPES.FILE, FILE_NAMES.FILE_A1)
const fileA2 = new File(TYPES.FILE, FILE_NAMES.FILE_A2)
const fileA4 = new File(TYPES.FILE, FILE_NAMES.FILE_A4)
const fileB2 = new File(TYPES.FILE, FILE_NAMES.FILE_B2)
const fileB3 = new File(TYPES.FILE, FILE_NAMES.FILE_B3)
const fileC1 = new File(TYPES.FILE, FILE_NAMES.FILE_C1)
const fileC2 = new File(TYPES.FILE, FILE_NAMES.FILE_C2)
const fileC3 = new File(TYPES.FILE, FILE_NAMES.FILE_C3)
const fileC4 = new File(TYPES.FILE, FILE_NAMES.FILE_C4)
const fileC5 = new File(TYPES.FILE, FILE_NAMES.FILE_C5)
// defining folder objects
const folderB1 = new Folder(TYPES.FOLDER, FOLDER_NAMES.FOLDER_B1, [fileC1, fileC2])
const folderB4 = new Folder(TYPES.FOLDER, FOLDER_NAMES.FOLDER_B4, [fileC3, fileC4, fileC5])
const folderA3 = new Folder(TYPES.FOLDER, FOLDER_NAMES.FOLDER_A3, [folderB1, fileB2, fileB3, folderB4])
const rootFolder = new Folder(TYPES.FOLDER, FOLDER_NAMES.ROOT_FOLDER, [fileA1, fileA2, folderA3, fileA4])
console.log(rootFolder);
Advertisement
Answer
remove indrection
Before we get started, first remove the myriad of indirection caused by over 20 variables. We can use just two (2) classes and a single root –
class File {
constructor(type, name) {
this.name = name
this.type = type
}
}
class Folder {
constructor(type, name, children) {
this.name = name
this.type = type
this.children = children
}
}
const root =
new Folder("folder", "root folder", [
new File("file", "file a1"),
new File("file", "file a2"),
new Folder("folder", "folder a3", [
new Folder("folder", "folder b1", [
new File("file", "file c1"),
new File("file", "file c2")
]),
new File("file", "file b2"),
new File("file", "file b3"),
new Folder("folder", "folder b4", [
new File("file", "file c3"),
new File("file", "file c4"),
new File("file", "file c5")
])
]),
new File("file", "file a4")
])
remove redundancy
TYPES.FILE and TYPES.FOLDER are unneeded if you have separate File and Folder classes. Each instance will already have an associated type –
class File {
constructor(name) { // ✅ type not necessary
this.name = name
}
}
class Folder {
constructor (name, children) { // ✅ type not necessary
this.name = name
this.children = children
}
}
const root =
new Folder("root folder", [
new File("file a1"),
new File("file a2"),
new Folder("folder a3", [
new Folder("folder b1", [
new File("file c1"),
new File("file c2")
]),
new File("file b2"),
new File("file b3"),
new Folder("folder b4", [
new File("file c3"),
new File("file c4"),
new File("file c5")
])
]),
new File("file a4")
])
multiple return values
Because file systems support folders or files with the same name (at different hierarchies), it’s important that our search algorithm can scan for all possible matches of the query. For a complete example, let’s include file c3 in folder b1 and folder b4. When searching for this file, we expect to see both paths returned –
const root =
new Folder("root folder", [
new File("file a1"),
new File("file a2"),
new Folder("folder a3", [
new Folder("folder b1", [
new File("file c1"),
new File("file c2"),
new File("file c3") // ✅ could have the same name
]),
new File("file b2"),
new File("file b3"),
new Folder("folder b4", [
new File("file c3"), // ✅ could have the same name
new File("file c4"),
new File("file c5")
])
]),
new File("file a4")
])
the algorithm
Now that we’ve fixed the other issues with the code, let’s write search(t, query). We can do this using inductive reasoning –
- If
tis a File and itsnamematches thequery, yield the singleton result[t] - (inductive)
tis not a File. Iftis a Folder and itsnamematches thequeryinput the singleton result[t]. And for allchildoft.children, for allpathof the recursive sub-problemsearch(child, query), prependtto thepathand yield. - (inductive)
tis neither a File nor a Folder. Throw a type error to notify the caller.
function* search(t, query) {
switch(t?.constructor) {
case File:
if (t.name == query)
yield [t]
break
case Folder:
if (t.name == query)
yield [t]
for (const child of t.children)
for (const path of search(child, query))
yield [t, ...path]
break
default:
throw Error(`cannot search unsupported type ${t}`)
}
}
Here’s some search examples –
for (const path of search(root, "root folder"))
console.log([".", ...path.map(f => f.name)].join("/"))
./root folder
for (const path of search(root, "file a1"))
console.log([".", ...path.map(f => f.name)].join("/"))
./root folder/file a1
Multiple paths are returned when a file or folder is found multiple times –
for (const path of search(root, "file c3"))
console.log([".", ...path.map(f => f.name)].join("/"))
./root folder/folder a3/folder b1/file c3 ./root folder/folder a3/folder b4/file c3
When a file or folder cannot be found, no output is given –
for (const path of search(root, "file Q"))
console.log([".", ...path.map(f => f.name)].join("/"))
⚠️ no output
caller decides effect
By yielding a sequence of Folder and File objects, our search algorithm remains generic and reusable. Strings are not assembled by default and nothing is logged to the console. If the caller wishes to create a /-separated path, that is demonstrated above. Options for a different effect are left for the caller to decide.
only one result
Using a generator is perfect for search because the caller can choose to consume all matches or any number of them. In this example, we use a generic first to return only the first match. Note the generator is canceled immediately after a match is found and no additional work is done by the cpu –
function first(iter) {
for (const v of iter)
return v
}
const match = first(search(root, "file c3"))
if (match)
console.log("found:", match.map(f => f.name).join("/"))
else
console.error("no match found")
found: root folderfolder a3/folder b1/file c3
Make sure you do a proper null-check if using first in this way as it is possible that search yields nothing when no match is found –
const match = first(search(root, "ZZZ"))
if (match)
console.log("found:", match.map(f => f.name).join("/"))
else
console.error("no match found")
no match found
higher-order search
In the original algorithm matches are determined using ==. The algorithm could be more useful if we allow the caller to specify what constitutes a match –
function* search(t, query) {
switch(t?.constructor) {
case File:
if (Boolean(query(t.name))) // ✅ query is a func
yield [t]
break
case Folder:
if (Boolean(query(t.name))) // ✅ query is a func
yield [t]
for (const child of t.children)
for (const path of search(child, query))
yield [t, ...path]
break
default:
throw Error(`cannot search unsupported type ${t}`)
}
}
Let’s find all files that start with file c –
for (const path of search(root, filename => filename.startsWith("file c")))
console.log([".", ...path.map(f => f.name)].join("/"))
./root folder/folder a3/folder b1/file c1 ./root folder/folder a3/folder b1/file c2 ./root folder/folder a3/folder b1/file c3 ./root folder/folder a3/folder b4/file c3 ./root folder/folder a3/folder b4/file c4 ./root folder/folder a3/folder b4/file c5
specialization
With a higher-order search function, you can easily implement things like searchByString. This is known as a specialization –
function searchByString(t, query) {
return search(t, filename => filename == query)
}
for (const path of searchByString(root, "file c3"))
console.log([".", ...path.map(f => f.name)].join("/"))
./root folder/folder a3/folder b1/file c3 ./root folder/folder a3/folder b4/file c3
code demo
The code above has been condensed so you can see it all in one place. Run the snippet and verify the output –
class File { constructor(name) { this.name = name } }
class Folder { constructor(name, children) { this.name = name; this.children = children } }
function* search(t, query) {
switch(t?.constructor) {
case File: if (Boolean(query(t.name))) yield [t]; break
case Folder: if (Boolean(query(t.name))) yield [t]; for (const child of t.children) for (const path of search(child, query)) yield [t, ...path]; break
default: throw Error(`cannot search unsupported type ${t}`)
}
}
function searchByString(t, query) {
return search(t, filename => filename == query)
}
const relative = (path) => [".", ...path.map(f => f.name)].join("/")
const root = new Folder("root folder", [new File("file a1"), new File("file a2"), new Folder("folder a3", [new Folder("folder b1", [new File("file c1"), new File("file c2"), new File("file c3")]), new File("file b2"), new File("file b3"), new Folder("folder b4", [new File("file c3"), new File("file c4"), new File("file c5")])]), new File("file a4")])
console.log("== exact match ==")
for (const path of search(root, filename => filename == "root folder"))
console.log(relative(path))
console.log("== starts with 'file c' ==")
for (const path of search(root, filename => filename.startsWith("file c")))
console.log(relative(path))
console.log("== searchByString ==")
for (const path of searchByString(root, "file c3"))
console.log(relative(path)).as-console-wrapper { min-height: 100%; top: 0; }== exact match == ./root folder == starts with 'file c' == ./root folder/folder a3/folder b1/file c1 ./root folder/folder a3/folder b1/file c2 ./root folder/folder a3/folder b1/file c3 ./root folder/folder a3/folder b4/file c3 ./root folder/folder a3/folder b4/file c4 ./root folder/folder a3/folder b4/file c5 == searchByString == ./root folder/folder a3/folder b1/file c3 ./root folder/folder a3/folder b4/file c3
python
search is written exactly the same in Python –
class file:
def __init__(self, name):
self.name = name
class folder:
def __init__(self, name, children):
self.name = name
self.children = children
def search(t, query):
if isinstance(t, file):
if t.name == query:
yield [t]
elif isinstance(t, folder):
if t.name == query:
yield [t]
for child in t.children:
for path in search(child, query):
yield [t, *path]
else:
raise TypeError("unsupported", t)
If you are using Python 3.10, there are some new features available to you. Adjust search to take advantage of the new structural match..case syntax –
from dataclasses import dataclass
from typing import Generator, Union
Ftree = Union["file", "folder"]
@dataclass
class file:
name: str
@dataclass
class folder:
name: str
children: list[Ftree]
def search(t: Ftree, query: str) -> Generator[list[Ftree], None, None]:
match t:
case file(name):
if name == query: yield [t]
case folder(name, children):
if name == query: yield [t]
for child in t.children:
for path in search(child, query):
yield [t, *path]
case _:
raise TypeError("unsupported", t)
Given a tree –
root =
folder("root folder", [
file("file a1"),
file("file a2"),
folder("folder a3", [
folder("folder b1", [
file("file c1"),
file("file c2"),
file("file c3")
]),
file("file b2"),
file("file b3"),
folder("folder b4", [
file("file c3"),
file("file c4"),
file("file c5")
])
]),
file("file a4")
])
Both implementations behave identically –
for path in search(root, "root folder"):
print("/".join([".", *map(lambda f: f.name, path)]))
./root folder
for path in search(root, "file c3"):
print("/".join([".", *map(lambda f: f.name, path)]))
./root folder/folder a3/folder b1/file c3 ./root folder/folder a3/folder b4/file c3
