Loop does not iterate over all data

I have code that produces the following df as output:

      year month   day      category                                                              keywords
0   '2021'  '09'  '06'          'us'                                      ['afghan, refugees, volunteers']
1   '2021'  '09'  '05'          'us'                         ['politics' 'military, drone, strike, kabul']
2   '2021'  '09'  '06'    'business'                                           ['rto, return, to, office']
3   '2021'  '09'  '06'    'nyregion'                                     ['nyc, jewish, high, holy, days']
4   '2021'  '09'  '06'       'world'                       ['americas' 'mexico, migrants, asylum, border']
5   '2021'  '09'  '06'          'us'                                        ['TAHOE, CALDORFIRE, WORKERS']
6   '2021'  '09'  '06'    'nyregion'                                         ['queens, flooding, cleanup']
7   '2021'  '09'  '05'          'us'  ['new, orleans, power, failure, traps, older, residents, in, homes']
8   '2021'  '09'  '05'    'nyregion'                              ['biden, flood, new, york, new, jersey']
9   '2021'  '09'  '06'  'technology'                         ['freedom, phone, smartphone, conservatives']
10  '2021'  '09'  '06'      'sports'                         ['football' 'nfl, preview, nfc, predictions']
11  '2021'  '09'  '06'      'sports'                         ['football' 'nfl, preview, afc, predictions']
12  '2021'  '09'  '06'     'opinion'                                    ['texas, abortion, september, 11']
13  '2021'  '09'  '06'     'opinion'                       ['coronavirus, masks, school, board, meetings']
14  '2021'  '09'  '06'     'opinion'                     ['south, republicans, vaccines, climate, change']
15  '2021'  '09'  '06'     'opinion'                                            ['labor, workers, rights']
16  '2021'  '09'  '05'     'opinion'                                             ['ku, kluxism, trumpism']
17  '2021'  '09'  '05'     'opinion'                            ['culture' 'sexually, harassed, pentagon']
18  '2021'  '09'  '05'     'opinion'                         ['parenting, college, empty, nest, pandemic']
19  '2021'  '09'  '04'     'opinion'                                    ['letters' 'coughlin, caregiving']
20  '2021'  '08'  '24'     'opinion'                            ['kara, swisher, maggie, haberman, event']
21  '2021'  '09'  '05'     'opinion'                                           ['labor, day, us, history']
22  '2021'  '09'  '04'     'opinion'                              ['drowning, our, future, in, the, past']
23  '2021'  '09'  '04'     'opinion'                                      ['biden, job, approval, rating']
24  '2021'  '09'  '05'     'opinion'                                    ['dorothy, day, christian, labor']
25  '2021'  '09'  '03'    'business'                                              ['goodbye, office, mom']
26  '2021'  '09'  '06'    'business'                            ['media' 'burn, out, companies, pandemic']
27  '2021'  '08'  '30'        'arts'                              ['music' 'popcast, lorde, solar, power']
28  '2021'  '09'  '02'     'opinion'               ['sway, kara, swisher, julie, cordua, ashton, kutcher']
29  '2021'  '08'  '12'     'science'                                    ['fauci, kids, and, covid, event']
30  '2021'  '09'  '05'          'us'                                       ['shooting, lakeland, florida']
31  '2021'  '09'  '05'    'business'                                    ['media' 'leah, finnegan, gawker']
32  '2021'  '09'  '06'    'nyregion'                                     ['piping, plovers, bird, rescue']
33  '2021'  '09'  '05'          'us'                              ['anti, abortion, movement, texas, law']
34  '2021'  '09'  '05'          'us'                          ['politics' 'bernie, sanders, budget, bill']
35  '2021'  '09'  '05'       'world'                                             ['africa' 'guinea, coup']
36  '2021'  '09'  '05'      'sports'                             ['soccer' 'brazil, argentina, suspended']
37  '2021'  '09'  '06'       'world'              ['africa' 'south, africa, jacob, zuma, medical, parole']
38  '2021'  '09'  '05'      'sports'                                              ['nfl, social, justice']
39  '2021'  '09'  '02'        'well'                                               ['go, bag, essentials']
40  '2021'  '09'  '01'   'parenting'                                          ['raising, resilient, kids']
41  '2021'  '09'  '03'       'books'                             ['911, anniversary, fiction, literature']
42  '2021'  '09'  '01'        'arts'                                  ['design' 'german, hygiene, museum']
43  '2021'  '09'  '03'        'arts'                                        ['music' 'opera, livestreams']
44  '2021'  '09'  '04'       'style'                            ['the, return, of, the, dream, honeymoon']
<class 'str'>


I built a for loop to iterate over all the elements in the 'keywords' column and put them separately into a new df called df1. The loop looks like this:

df1 = pd.DataFrame(columns=['word'])

i = 0

for p in df.loc[i, 'keywords']:

    teststr = df.loc[i, 'keywords']


    splitstr = teststr.split()

    u = 0

    for p1 in splitstr:
        dict_1 = {'word': splitstr[u]}
        df1.loc[len(df1)] = dict_1
        u = u + 1

    i = i + 1

print(df1)


The output it produces is:

                word
0          ['afghan,
1          refugees,
2       volunteers']
3        ['politics'
4         'military,
5             drone,
6            strike,
7            kabul']
8             ['rto,
9            return,
10               to,
11          office']
12            ['nyc,
13           jewish,
14             high,
15             holy,
16            days']
17       ['americas'
18          'mexico,
19         migrants,
20           asylum,
21          border']
22          ['TAHOE,
23       CALDORFIRE,
24         WORKERS']
25         ['queens,
26         flooding,
27         cleanup']
28            ['new,
29          orleans,
30            power,
31          failure,
32            traps,
33            older,
34        residents,
35               in,
36           homes']
37          ['biden,
38            flood,
39              new,
40             york,
41              new,
42          jersey']
43        ['freedom,
44            phone,
45       smartphone,
46   conservatives']
47       ['football'
48             'nfl,
49          preview,
50              nfc,
51     predictions']
52       ['football'
53             'nfl,
54          preview,
55              afc,
56     predictions']
57          ['texas,
58         abortion,
59        september,
60              11']
61    ['coronavirus,
62            masks,
63           school,
64            board,
65        meetings']
66          ['south,
67      republicans,
68         vaccines,
69          climate,
70          change']
71          ['labor,
72          workers,
73          rights']
74             ['ku,
75          kluxism,
76        trumpism']
77        ['culture'
78        'sexually,
79         harassed,
80        pentagon']
81      ['parenting,
82          college,
83            empty,
84             nest,
85        pandemic']
86        ['letters'
87        'coughlin,
88      caregiving']
89           ['kara,
90          swisher,
91           maggie,
92         haberman,
93           event']
94          ['labor,
95              day,
96               us,
97         history']
98       ['drowning,
99              our,
100          future,
101              in,
102             the,
103           past']
104         ['biden,
105             job,
106        approval,
107         rating']
108       ['dorothy,
109             day,
110       christian,
111          labor']
112       ['goodbye,
113          office,
114            mom']
115         ['media'
116           'burn,
117             out,
118       companies,
119       pandemic']
120         ['music'
121        'popcast,
122           lorde,
123           solar,
124          power']
125          ['sway,
126            kara,
127         swisher,
128           julie,
129          cordua,
130          ashton,
131        kutcher']
132         ['fauci,
133            kids,
134             and,
135           covid,
136          event']
137      ['shooting,
138        lakeland,
139        florida']
140         ['media'
141           'leah,
142        finnegan,
143         gawker']


Although the for loop runs without errors, it does not iterate over all the rows of df and stops more or less in the middle (it doesn't always stop at the same spot).

Do you have any idea why? Thanks in advance.

Answer

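I don't think your for loop iterates over all the rows. The expression in for p in df.loc[i, 'keywords'] is evaluated once, while i is still 0, so the loop runs over the characters of the first row's string rather than over the rows of df.
Do this instead: for i in range(len(df)):
You can then also remove i = 0 and i = i + 1.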
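To make the suggestion concrete, here is a minimal sketch. It assumes a small stand-in df with only a 'keywords' column (three rows copied from the question), and it collects the words in a plain list before building df1, which is a swap from the question's df1.loc[len(df1)] = ... approach rather than anything the asker wrote. The name iterations_in_original_loop is only illustrative.

```python
import pandas as pd

# Stand-in for the question's df; only the 'keywords' column matters here.
df = pd.DataFrame({
    'keywords': [
        "['afghan, refugees, volunteers']",
        "['politics' 'military, drone, strike, kabul']",
        "['rto, return, to, office']",
    ]
})

# Why the original loop stops early: df.loc[0, 'keywords'] is a str, so
# `for p in df.loc[i, 'keywords']` runs once per character of row 0.
first = df.loc[0, 'keywords']
iterations_in_original_loop = len(first)

# Fix: iterate over the row positions instead of over a string.
# range(len(df)) matches the labels here because df has a default 0..n-1 index.
words = []
for i in range(len(df)):
    # split() with no argument splits on whitespace, as in the question
    words.extend(df.loc[i, 'keywords'].split())

# Build the result once from the collected list.
df1 = pd.DataFrame({'word': words})

print(iterations_in_original_loop)
print(df1)
```

With the question's data the first string is 32 characters long, which would explain why the posted output ends after row 31. If the first row differs from run to run, the stopping point would move with it. The tokens still carry the brackets, quotes, and commas seen in the question's output; cleaning those is a separate step.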

Answer