trend_level: TypeError resolved (issue #54) #64

Open: 7 commits to merge into base: master

Tanvi-Jain01

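@nipunbatra, @patel-zeel
This PR proposes a solution for issue #54. When the data covers a single year, plt.subplots(nrows=1) returns a single Axes object rather than an array of Axes, so indexing it with ax[i] raises the TypeError shown below.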

BEFORE:

CODE:

import pandas as pd

file_path = r'C:\..\mydata.csv'

# Read the CSV file into a DataFrame
df = pd.read_csv(file_path)

#df.reset_index(inplace=True)
df['date'] = pd.to_datetime(df['date'])
print(df)

df_2003 = df[df['date'].dt.year == 2003]
print(df_2003)

from vayu.trendLevel import trendLevel
trendLevel(df_2003, 'pm25')

Error:

TypeError                                 Traceback (most recent call last)
Cell In[59], line 2
      1 from vayu.trendLevel import trendLevel
----> 2 trendLevel(df_2003, 'pm25')

File ~\anaconda3\lib\site-packages\vayu\trendLevel.py:45, in trendLevel(df, pollutant, **kwargs)
     34 t = pollutant_series_year.groupby(
     35     [pollutant_series_year.index.month, pollutant_series_year.index.hour]
     36 ).mean()
     37 two_d_array = t.values.reshape(12, 24).T
     38 sns.heatmap(
     39     two_d_array,
     40     cbar=True,
     41     linewidth=0,
     42     cmap="Spectral_r",
     43     vmin=0,
     44     vmax=400,
---> 45     ax=ax[i],
     46 )
     47 ax[i].set_title(year_string)
     48 ax[i].invert_yaxis()

TypeError: 'Axes' object is not subscriptable
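
The cause can be reproduced without vayu or any data. The following is a minimal sketch, assuming only matplotlib and NumPy are installed; the variable names are illustrative and the Agg backend is chosen just so it runs without a display. It shows what plt.subplots returns for one row versus several, and that passing squeeze=False is one alternative way to always get an indexable array.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# One row: subplots() squeezes the result down to a single Axes object,
# which cannot be indexed with ax[i].
fig_one, ax_one = plt.subplots(nrows=1)
single_is_array = isinstance(ax_one, np.ndarray)

# Several rows: subplots() returns a 1-D NumPy array of Axes.
fig_many, ax_many = plt.subplots(nrows=3)
many_is_array = isinstance(ax_many, np.ndarray)

# squeeze=False always returns a 2-D array, here of shape (nrows, 1),
# so indexing works for any number of years.
fig_safe, ax_safe = plt.subplots(nrows=1, squeeze=False)
axes_flat = ax_safe.ravel()  # 1-D array of length 1

plt.close("all")
```

With squeeze=False the loop could index axes_flat[i] for any number of years, which would avoid a special case for a single year.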

IMPROVED CODE:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

def trend_level(df: pd.DataFrame, pollutant: str, **kwargs):
    """
    Plot the overall pollutant trend for every year in df. For each
    year, the mean value for each hour of the day is computed per month
    and drawn as a heatmap, showing at what times of the year the
    pollutant concentration is high.

    Parameters
    ----------
    df: pd.DataFrame
        Data frame of complete data; must contain a `date` column.
    pollutant: str
        Name of the column in df to plot.
    """
   

    # Index a copy by date so the caller's DataFrame is not modified.
    df = df.set_index(pd.to_datetime(df["date"]))
    pollutant_series = df[pollutant]
    unique_years = np.unique(df.index.year)
    num_unique_years = len(unique_years)
    fig, ax = plt.subplots(nrows=num_unique_years, figsize=(20, 20))

    months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul',
              'Aug', 'Sep', 'Oct', 'Nov', 'Dec']


    for i, year in enumerate(unique_years):
       
        year_string = str(year)
        pollutant_series_year = pollutant_series[year_string]
        t = pollutant_series_year.groupby(
            [pollutant_series_year.index.month, pollutant_series_year.index.hour]
        ).mean()
        two_d_array = t.values.reshape(12, 24).T
        heatmap_ax = ax[i] if num_unique_years > 1 else ax
        sns.heatmap(
            two_d_array,
            cbar=True,
            linewidth=0,
            cmap="Spectral_r",
            vmin=0,
            vmax=400,
            ax=heatmap_ax
        )
        heatmap_ax.set_xticklabels(months)
        heatmap_ax.set_ylabel("Hour of the Day")
        heatmap_ax.set_title(year_string)
        heatmap_ax.invert_yaxis()
        
    plt.savefig("TrendLevelPlot.png", bbox_inches="tight", dpi=300)
    print("Your plot has also been saved")
    plt.show()  # Display the plot
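
One limitation worth noting: reshape(12, 24) only succeeds when the year has data for every month and hour combination. The following is a hedged sketch of one way to build the same 24 x 12 grid when some combinations are missing. month_hour_grid is a hypothetical helper, not part of vayu or of this PR, and it assumes the series has a DatetimeIndex.

```python
import numpy as np
import pandas as pd


def month_hour_grid(series: pd.Series) -> np.ndarray:
    """Return a 24 x 12 array of mean values (rows: hour, columns: month).

    Hypothetical helper for illustration. Month/hour combinations
    with no data are returned as NaN instead of breaking a reshape.
    """
    frame = pd.DataFrame(
        {
            "month": series.index.month,
            "hour": series.index.hour,
            "value": series.to_numpy(),
        }
    )
    # Mean value for each (hour, month) pair that is present in the data.
    grid = frame.pivot_table(
        index="hour", columns="month", values="value", aggfunc="mean"
    )
    # Force the full 24 x 12 layout; absent hours or months become NaN.
    grid = grid.reindex(index=range(24), columns=range(1, 13))
    return grid.to_numpy()
```

Inside the loop this could stand in for the groupby and reshape steps, for example two_d_array = month_hour_grid(pollutant_series_year). seaborn draws NaN cells as blank, so a partial year would show gaps rather than raise an error.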

USAGE:

df = pd.read_csv("mydata.csv")
trend_level(df, 'pm25')
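
For anyone without access to mydata.csv, the single-year case can be exercised with synthetic data. This is a sketch only: the pm25 values below are random numbers made up for illustration, and the column names simply mirror the ones used in this PR.

```python
import numpy as np
import pandas as pd

# Hourly timestamps covering all of 2003 (365 days * 24 hours = 8760 rows).
timestamps = pd.date_range(
    "2003-01-01 00:00", "2003-12-31 23:00", freq=pd.Timedelta(hours=1)
)

rng = np.random.default_rng(seed=0)
synthetic_df = pd.DataFrame(
    {
        "date": timestamps,
        # Made-up concentrations in the 0 to 400 range used by the heatmap.
        "pm25": rng.uniform(0, 400, size=len(timestamps)),
    }
)

# With trend_level defined as in this PR, a one-year frame like this
# goes through the code path that previously raised the TypeError:
# trend_level(synthetic_df, "pm25")
```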

OUTPUT:
[Image: Trendlevelplot]
