Skip to content

Pandas

Read from or write to Excel files

Read from an Excel file

shell
pip3 install openpyxl
pip3 install openpyxl

The pandas.read_excel API reads an Excel file into a pandas DataFrame.

python
import pandas as pd
df = pd.read_excel('Apple Income Statement.xlsx', sheet_name='sheet_0', header=4, index_col=0)
import pandas as pd
df = pd.read_excel('Apple Income Statement.xlsx', sheet_name='sheet_0', header=4, index_col=0)
  • header: default 0
    Row (0-indexed) to use for the column labels of the parsed DataFrame.
  • index_colint: default None
    Column (0-indexed) to use as the row labels of the DataFrame. Pass None if there is no such column.

Write to an Excel file

python
df.to_excel("path_to_file.xlsx",sheet_name="Sheet1", index=False)
df.to_excel("path_to_file.xlsx",sheet_name="Sheet1", index=False)

Select rows

Select by one column value

python
import numpy as np
df2 = df.iloc[np.array(df.is_first_launch==0)]
import numpy as np
df2 = df.iloc[np.array(df.is_first_launch==0)]

Sort rows

The sort_values API can sort the dataframe by the values along either axis.

python
sorted_df = df.sort_values(by=['launch_time'], ascending=True)
sorted_df = df.sort_values(by=['launch_time'], ascending=True)
  • axis: default 0
    Axis to be sorted, 0 or index for sorting rows, 1 or columns for sorting columns.
  • by:
    Name or list of names to sort by.

Add a column by applying a lambda

python
# Import the library
import pandas as pd
 
# dataframe
df = pd.DataFrame({'Name': ['John', 'Jack', 'Shri',
                            'Krishna', 'Smith', 'Tessa'],
                   'Maths': [5, 3, 9, 10, 6, 3]})
 
# Adding the result column
df['Result'] = df['Maths'].apply(lambda x: 'Pass' if x>=5 else 'Fail')
 
print(df)
# Import the library
import pandas as pd
 
# dataframe
df = pd.DataFrame({'Name': ['John', 'Jack', 'Shri',
                            'Krishna', 'Smith', 'Tessa'],
                   'Maths': [5, 3, 9, 10, 6, 3]})
 
# Adding the result column
df['Result'] = df['Maths'].apply(lambda x: 'Pass' if x>=5 else 'Fail')
 
print(df)
python


# Import the library
import pandas as pd
 
# dataframe
df = pd.DataFrame({'Name': ['John', 'Jack', 'Shri', 
                            'Krishna', 'Smith', 'Tessa'],
                   'Maths': [5, 3, 9, 10, 6, 3]})
 
# Defining all the conditions inside a function
def condition(x):
    if x>8:
        return "No need"
    elif x>=5 and x<=7:
        return "Hold decision"
    else:
        return 'Need'
 
# Applying the conditions
df['Maths_Spl Class'] = df['Maths'].apply(condition)
 
print(df)


# Import the library
import pandas as pd
 
# dataframe
df = pd.DataFrame({'Name': ['John', 'Jack', 'Shri', 
                            'Krishna', 'Smith', 'Tessa'],
                   'Maths': [5, 3, 9, 10, 6, 3]})
 
# Defining all the conditions inside a function
def condition(x):
    if x>8:
        return "No need"
    elif x>=5 and x<=7:
        return "Hold decision"
    else:
        return 'Need'
 
# Applying the conditions
df['Maths_Spl Class'] = df['Maths'].apply(condition)
 
print(df)

References