Appearance
Pandas
Read from or write to Excel files
Read from an Excel file
shell
pip3 install openpyxl
pip3 install openpyxl
The pandas.read_excel API reads an Excel file into a pandas DataFrame.
python
import pandas as pd
df = pd.read_excel('Apple Income Statement.xlsx', sheet_name='sheet_0', header=4, index_col=0)
import pandas as pd
df = pd.read_excel('Apple Income Statement.xlsx', sheet_name='sheet_0', header=4, index_col=0)
- header: default
0
Row (0-indexed) to use for the column labels of the parsed DataFrame. - index_colint: default
None
Column (0-indexed) to use as the row labels of the DataFrame. PassNone
if there is no such column.
Write to an Excel file
python
df.to_excel("path_to_file.xlsx",sheet_name="Sheet1", index=False)
df.to_excel("path_to_file.xlsx",sheet_name="Sheet1", index=False)
Select rows
Select by one column value
python
import numpy as np
df2 = df.iloc[np.array(df.is_first_launch==0)]
import numpy as np
df2 = df.iloc[np.array(df.is_first_launch==0)]
Sort rows
The sort_values API can sort the dataframe by the values along either axis.
python
sorted_df = df.sort_values(by=['launch_time'], ascending=True)
sorted_df = df.sort_values(by=['launch_time'], ascending=True)
- axis: default
0
Axis to be sorted,0
orindex
for sorting rows,1
orcolumns
for sorting columns. - by:
Name or list of names to sort by.
Add a column by applying a lambda
python
# Import the library
import pandas as pd
# dataframe
df = pd.DataFrame({'Name': ['John', 'Jack', 'Shri',
'Krishna', 'Smith', 'Tessa'],
'Maths': [5, 3, 9, 10, 6, 3]})
# Adding the result column
df['Result'] = df['Maths'].apply(lambda x: 'Pass' if x>=5 else 'Fail')
print(df)
# Import the library
import pandas as pd
# dataframe
df = pd.DataFrame({'Name': ['John', 'Jack', 'Shri',
'Krishna', 'Smith', 'Tessa'],
'Maths': [5, 3, 9, 10, 6, 3]})
# Adding the result column
df['Result'] = df['Maths'].apply(lambda x: 'Pass' if x>=5 else 'Fail')
print(df)
python
# Import the library
import pandas as pd
# dataframe
df = pd.DataFrame({'Name': ['John', 'Jack', 'Shri',
'Krishna', 'Smith', 'Tessa'],
'Maths': [5, 3, 9, 10, 6, 3]})
# Defining all the conditions inside a function
def condition(x):
if x>8:
return "No need"
elif x>=5 and x<=7:
return "Hold decision"
else:
return 'Need'
# Applying the conditions
df['Maths_Spl Class'] = df['Maths'].apply(condition)
print(df)
# Import the library
import pandas as pd
# dataframe
df = pd.DataFrame({'Name': ['John', 'Jack', 'Shri',
'Krishna', 'Smith', 'Tessa'],
'Maths': [5, 3, 9, 10, 6, 3]})
# Defining all the conditions inside a function
def condition(x):
if x>8:
return "No need"
elif x>=5 and x<=7:
return "Hold decision"
else:
return 'Need'
# Applying the conditions
df['Maths_Spl Class'] = df['Maths'].apply(condition)
print(df)