I started with an empty DataFrame like this:
import pandas as pd
df = pd.DataFrame(columns=('lib', 'qty1', 'qty2'))
I can add a value to a single cell using:
df.loc[len(df), 'qty1'] = 10.0
But this only works for one field at a time and seems cumbersome. What is a better, more efficient way to append a full new row to a Pandas DataFrame?
Instead of setting individual cells, you can build a dictionary (or a Series) holding the full row and add it in one step. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in 2.0, so on current versions use pd.concat():
new_row = {'lib': 'A', 'qty1': 10, 'qty2': 20}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
This adds the entire row at once. Like the old append(), pd.concat() returns a new DataFrame rather than modifying df in place, so always reassign the result. It is simple, but not the most efficient approach for many rows.
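In case it helps, here is a minimal sketch of two ways to add one full row that should run on current pandas versions. The sample values are made up, and the .loc variant assumes the DataFrame still has its default 0..n-1 integer index, so that len(df) is the next free label.

```python
import pandas as pd

df = pd.DataFrame(columns=['lib', 'qty1', 'qty2'])

# Option 1: assign a whole row by label. len(df) is used as the next
# free label, which assumes the default 0..n-1 integer index.
df.loc[len(df)] = ['A', 10, 20]

# Option 2: wrap the dict in a one-row DataFrame and concatenate.
# pd.concat returns a new DataFrame, so the result is reassigned.
new_row = {'lib': 'B', 'qty1': 5, 'qty2': 15}
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

print(df)
```

Both options copy data on every call, so they are best kept for the occasional single row.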
Appending rows one by one (whether with append() or pd.concat()) is easy but can be slow on large datasets, because each call copies the whole DataFrame. For better performance, collect all rows in a list first:
rows = []
rows.append({'lib': 'A', 'qty1': 10, 'qty2': 20})
# ...append more rows
df = pd.DataFrame(rows)
This approach minimizes overhead and creates your DataFrame in one go.
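A slightly fuller sketch of the same idea, with the rows gathered in a loop. The input tuples are hypothetical placeholders for wherever your data comes from. Passing columns= is optional; it is used here only to pin the column order.

```python
import pandas as pd

# Hypothetical source data; in practice this could be a file, an API, etc.
source = [('A', 10, 20), ('B', 5, 15), ('C', 7, 3)]

rows = []
for lib, qty1, qty2 in source:
    # Appending to a Python list is cheap; no DataFrame is copied here.
    rows.append({'lib': lib, 'qty1': qty1, 'qty2': qty2})

# Build the DataFrame once, after all rows have been collected.
df = pd.DataFrame(rows, columns=['lib', 'qty1', 'qty2'])

print(df)
```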
Thanks @alveera.khn
Also, if you want to append rows repeatedly but efficiently, avoid calling append() (or pd.concat()) inside a loop. Instead, build a list of dictionaries or DataFrames and concatenate once at the end:
rows = []
rows.append({'lib': 'A', 'qty1': 10, 'qty2': 20})
rows.append({'lib': 'B', 'qty1': 5, 'qty2': 15})
df = pd.concat([df, pd.DataFrame(rows)], ignore_index=True)
This balances convenience and performance, especially when dealing with many rows.
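The same pattern works when the pieces are already DataFrames. Below is a rough sketch, assuming the data arrives in batches; the batches list is invented purely for illustration.

```python
import pandas as pd

# Hypothetical batches standing in for data that arrives piece by piece.
batches = [
    [{'lib': 'A', 'qty1': 10, 'qty2': 20}],
    [{'lib': 'B', 'qty1': 5, 'qty2': 15},
     {'lib': 'C', 'qty1': 7, 'qty2': 3}],
]

# Turn each batch into its own DataFrame and keep them in a list.
frames = [pd.DataFrame(batch) for batch in batches]

# Concatenate once at the end; ignore_index=True renumbers rows 0..n-1.
df = pd.concat(frames, ignore_index=True)

print(df)
```

Calling pd.concat() once at the end means the data is copied a single time instead of once per batch.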