
Loop does not iterate over all data

I have code that produces the following df as output:

      year month   day      category                                                              keywords
0   '2021'  '09'  '06'          'us'                                      ['afghan, refugees, volunteers']
1   '2021'  '09'  '05'          'us'                         ['politics' 'military, drone, strike, kabul']
2   '2021'  '09'  '06'    'business'                                           ['rto, return, to, office']
3   '2021'  '09'  '06'    'nyregion'                                     ['nyc, jewish, high, holy, days']
4   '2021'  '09'  '06'       'world'                       ['americas' 'mexico, migrants, asylum, border']
5   '2021'  '09'  '06'          'us'                                        ['TAHOE, CALDORFIRE, WORKERS']
6   '2021'  '09'  '06'    'nyregion'                                         ['queens, flooding, cleanup']
7   '2021'  '09'  '05'          'us'  ['new, orleans, power, failure, traps, older, residents, in, homes']
8   '2021'  '09'  '05'    'nyregion'                              ['biden, flood, new, york, new, jersey']
9   '2021'  '09'  '06'  'technology'                         ['freedom, phone, smartphone, conservatives']
10  '2021'  '09'  '06'      'sports'                         ['football' 'nfl, preview, nfc, predictions']
11  '2021'  '09'  '06'      'sports'                         ['football' 'nfl, preview, afc, predictions']
12  '2021'  '09'  '06'     'opinion'                                    ['texas, abortion, september, 11']
13  '2021'  '09'  '06'     'opinion'                       ['coronavirus, masks, school, board, meetings']
14  '2021'  '09'  '06'     'opinion'                     ['south, republicans, vaccines, climate, change']
15  '2021'  '09'  '06'     'opinion'                                            ['labor, workers, rights']
16  '2021'  '09'  '05'     'opinion'                                             ['ku, kluxism, trumpism']
17  '2021'  '09'  '05'     'opinion'                            ['culture' 'sexually, harassed, pentagon']
18  '2021'  '09'  '05'     'opinion'                         ['parenting, college, empty, nest, pandemic']
19  '2021'  '09'  '04'     'opinion'                                    ['letters' 'coughlin, caregiving']
20  '2021'  '08'  '24'     'opinion'                            ['kara, swisher, maggie, haberman, event']
21  '2021'  '09'  '05'     'opinion'                                           ['labor, day, us, history']
22  '2021'  '09'  '04'     'opinion'                              ['drowning, our, future, in, the, past']
23  '2021'  '09'  '04'     'opinion'                                      ['biden, job, approval, rating']
24  '2021'  '09'  '05'     'opinion'                                    ['dorothy, day, christian, labor']
25  '2021'  '09'  '03'    'business'                                              ['goodbye, office, mom']
26  '2021'  '09'  '06'    'business'                            ['media' 'burn, out, companies, pandemic']
27  '2021'  '08'  '30'        'arts'                              ['music' 'popcast, lorde, solar, power']
28  '2021'  '09'  '02'     'opinion'               ['sway, kara, swisher, julie, cordua, ashton, kutcher']
29  '2021'  '08'  '12'     'science'                                    ['fauci, kids, and, covid, event']
30  '2021'  '09'  '05'          'us'                                       ['shooting, lakeland, florida']
31  '2021'  '09'  '05'    'business'                                    ['media' 'leah, finnegan, gawker']
32  '2021'  '09'  '06'    'nyregion'                                     ['piping, plovers, bird, rescue']
33  '2021'  '09'  '05'          'us'                              ['anti, abortion, movement, texas, law']
34  '2021'  '09'  '05'          'us'                          ['politics' 'bernie, sanders, budget, bill']
35  '2021'  '09'  '05'       'world'                                             ['africa' 'guinea, coup']
36  '2021'  '09'  '05'      'sports'                             ['soccer' 'brazil, argentina, suspended']
37  '2021'  '09'  '06'       'world'              ['africa' 'south, africa, jacob, zuma, medical, parole']
38  '2021'  '09'  '05'      'sports'                                              ['nfl, social, justice']
39  '2021'  '09'  '02'        'well'                                               ['go, bag, essentials']
40  '2021'  '09'  '01'   'parenting'                                          ['raising, resilient, kids']
41  '2021'  '09'  '03'       'books'                             ['911, anniversary, fiction, literature']
42  '2021'  '09'  '01'        'arts'                                  ['design' 'german, hygiene, museum']
43  '2021'  '09'  '03'        'arts'                                        ['music' 'opera, livestreams']
44  '2021'  '09'  '04'       'style'                            ['the, return, of, the, dream, honeymoon']
<class 'str'>

I built a for loop to iterate over all the elements in the 'keywords' column and put each word separately into a new df called df1. The loop looks like this:

df1 = pd.DataFrame(columns=['word'])

i = 0

for p in df.loc[i, 'keywords']:

    teststr = df.loc[i, 'keywords']


    splitstr = teststr.split()

    u = 0

    for p1 in splitstr:
        dict_1 = {'word': splitstr[u]}
        df1.loc[len(df1)] = dict_1
        u = u + 1

    i = i + 1

print(df1)

The output it produces is:

                word
0          ['afghan,
1          refugees,
2       volunteers']
3        ['politics'
4         'military,
5             drone,
6            strike,
7            kabul']
8             ['rto,
9            return,
10               to,
11          office']
12            ['nyc,
13           jewish,
14             high,
15             holy,
16            days']
17       ['americas'
18          'mexico,
19         migrants,
20           asylum,
21          border']
22          ['TAHOE,
23       CALDORFIRE,
24         WORKERS']
25         ['queens,
26         flooding,
27         cleanup']
28            ['new,
29          orleans,
30            power,
31          failure,
32            traps,
33            older,
34        residents,
35               in,
36           homes']
37          ['biden,
38            flood,
39              new,
40             york,
41              new,
42          jersey']
43        ['freedom,
44            phone,
45       smartphone,
46   conservatives']
47       ['football'
48             'nfl,
49          preview,
50              nfc,
51     predictions']
52       ['football'
53             'nfl,
54          preview,
55              afc,
56     predictions']
57          ['texas,
58         abortion,
59        september,
60              11']
61    ['coronavirus,
62            masks,
63           school,
64            board,
65        meetings']
66          ['south,
67      republicans,
68         vaccines,
69          climate,
70          change']
71          ['labor,
72          workers,
73          rights']
74             ['ku,
75          kluxism,
76        trumpism']
77        ['culture'
78        'sexually,
79         harassed,
80        pentagon']
81      ['parenting,
82          college,
83            empty,
84             nest,
85        pandemic']
86        ['letters'
87        'coughlin,
88      caregiving']
89           ['kara,
90          swisher,
91           maggie,
92         haberman,
93           event']
94          ['labor,
95              day,
96               us,
97         history']
98       ['drowning,
99              our,
100          future,
101              in,
102             the,
103           past']
104         ['biden,
105             job,
106        approval,
107         rating']
108       ['dorothy,
109             day,
110       christian,
111          labor']
112       ['goodbye,
113          office,
114            mom']
115         ['media'
116           'burn,
117             out,
118       companies,
119       pandemic']
120         ['music'
121        'popcast,
122           lorde,
123           solar,
124          power']
125          ['sway,
126            kara,
127         swisher,
128           julie,
129          cordua,
130          ashton,
131        kutcher']
132         ['fauci,
133            kids,
134             and,
135           covid,
136          event']
137      ['shooting,
138        lakeland,
139        florida']
140         ['media'
141           'leah,
142        finnegan,
143         gawker']

Although the for loop runs without errors, it does not iterate over all the rows of df and stops somewhere around the middle (not always at the same spot).

Do you have an idea why? Thanks in advance


Answer

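Your for loop does not iterate over all the rows. Because i is 0 when the loop starts, for p in df.loc[i, 'keywords'] iterates over the characters of the first row's string, so the body runs once per character of that string, not once per row. In the sample above that string is 32 characters long, which matches the output stopping at row 31, and the stopping point moves whenever the first row's keywords change.
Do this instead: for i in range(len(df)):
You can then also remove i = 0 and i = i + 1.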
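
As a rough sketch of that fix, the loop below runs once per row. The three-row df is a made-up sample in the same string format as the question, and collecting the words in a plain list (here called words) before building df1 is my own choice, not something the question requires. It also assumes df has the default 0..n-1 index, as in the printed output.

```python
import pandas as pd

# Hypothetical three-row sample; the strings mimic the format shown in the question.
df = pd.DataFrame({
    "keywords": [
        "['afghan, refugees, volunteers']",
        "['politics' 'military, drone, strike, kabul']",
        "['rto, return, to, office']",
    ]
})

# The original loop header iterated over the characters of the first string,
# so it ran len(df.loc[0, 'keywords']) times instead of len(df) times.
first_len = len(df.loc[0, "keywords"])
print(first_len, len(df))

words = []
for i in range(len(df)):                          # one pass per row
    for word in df.loc[i, "keywords"].split():    # split on whitespace, as in the question
        words.append(word)

# Build the result frame once instead of appending one row at a time.
df1 = pd.DataFrame({"word": words})
print(df1)
```

If the index is not 0..n-1 (for example after filtering), iterating directly with for s in df['keywords']: avoids the positional lookup altogether.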

User contributions licensed under: CC BY-SA