
Sorting and Re-formatting a text file

I need help with the following:

  1. Reading from one string until another specified string.
  2. Merging 2 strings on separate lines onto 1 line. I tried strip(), but this was not successful.
  3. Creating 2 separate arrays from the text provided.

Given:

Cat Chores
Get 
cat food.
Dog Chores
Get
dog food. Walk Dog.

Desired output:

Cat Chores
Get cat food.

These sentences are separated because they will be put in an array.

Dog Chores
Get dog food. Walk Dog.

Final output:

cat_chores = ["Get cat food."]
dog_chores = ["Get dog food.", "Walk Dog."]

Here is my code:

# Remove whitespace and reformat the file
with open('chores.txt',"r") as f:
  text = input.read()
  text = [lines.strip() for lines in text] 

with open('chores.txt',"w") as f:
  f.writelines(text)
  f.close

# Re-open the file to create the arrays.
with open('chores.txt',"r") as f:
  text = input.read()

if "Cat Chores" in text:
  print (line,end='')
  print(next(input),end='')

if "Dog Chores" in text:
  print (line,end='')
  print(next(input),end='')


Answer

Try this:

chores = {}
action = ''
with open('chores.txt', 'r') as f:
    for line in f.read().splitlines():
        line = line.strip()  # the original data has trailing spaces; remove them
        if 'Chores' in line:  # the line is a group heading
            current_chore = line
            chores[current_chore] = []
        elif len(line.split(' ')) == 1:  # a single word is the first part of an action
            action = line + ' '
            continue
        else:
            chores[current_chore].append(action + line)
            action = ''

with open('chores.txt', 'w') as f:
    f.write(str(chores))  # the with block closes the file, so no explicit close is needed

It will write the following to chores.txt:

{'Cat Chores': ['Get cat food.'], 'Dog Chores': ['Get dog food. Walk Dog.']}

This assumes that a group heading always contains 'Chores' and that the first part of an action is always a single word on its own line. It writes the string representation of the dictionary back to the file. My version doesn't separate 'Get dog food.' and 'Walk Dog.', but if you want that, you can split each entry on '. ' and handle the pieces. The formatting of your input data is awkward to parse and really shouldn't be used as is.
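If you do want the two separate lists from your "Final output", here is a minimal sketch of one way to get them. It is a variation on the code above, not the same method: it joins every line under a heading and then splits after each period, so it does not rely on the action being a single word. The names parse_chores and SAMPLE are made up for this example, and the sample text is inlined so the snippet runs without chores.txt.

```python
import re

# Inlined copy of the sample data from the question (hypothetical constant).
SAMPLE = "Cat Chores\nGet \ncat food.\nDog Chores\nGet\ndog food. Walk Dog.\n"


def parse_chores(text):
    """Group lines under headings containing 'Chores', then split into sentences."""
    groups = {}      # heading -> list of stripped line fragments
    current = None   # heading currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if "Chores" in line:
            current = line
            groups[current] = []
        elif current is not None:
            groups[current].append(line)

    result = {}
    for heading, parts in groups.items():
        joined = " ".join(parts)  # merge the fragments onto one line
        # Split at whitespace that follows a period, keeping the period.
        result[heading] = [s for s in re.split(r"(?<=\.)\s+", joined) if s]
    return result


chores = parse_chores(SAMPLE)
cat_chores = chores["Cat Chores"]
dog_chores = chores["Dog Chores"]
print(cat_chores)  # ['Get cat food.']
print(dog_chores)  # ['Get dog food.', 'Walk Dog.']
```

To use it on the real file, read the file once with open('chores.txt') and pass f.read() to parse_chores. Splitting on a period will also break abbreviations such as 'e.g.', so treat this as a starting point for this particular data.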

User contributions licensed under: CC BY-SA