Blog - line plot #23

Open: wants to merge 5 commits into base: gh-pages

@kdorr (Collaborator) commented Aug 15, 2018

This is the line plot section from the blog post Nabarun and I were working on.

@story645 (Member) left a comment

Same as #22, maybe add a line of code showing use of the 'ax' object.

# Making a Line Plot
Altair works best with [long-form](https://altair-viz.github.io/user_guide/data.html#long-form-vs-wide-form-data) data. This is where each row contains a single observation along with all of its metadata stored as values.

Matplotlib works a little better with wide-form data.
Member

What's the rationale behind this statement?

Collaborator Author

That was just my impression after converting so many Altair charts to Matplotlib by hand. It seems like long-form data usually has to be reshaped into something closer to wide-form data for Matplotlib whenever things get slightly complicated (e.g., scatter plots that have categorical colors and line plots like the one in this post).

Is it incorrect to say that mpl works a little better with wide-form data? Would it be better to just leave that statement out?
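
For reference, the kind of reshaping described above could look roughly like this with pandas. This is only a sketch: the column names `set`, `amount`, and `location` are assumed from the post, and the rows for location `e` are made up for illustration.

```python
import pandas as pd

# Illustrative long-form data: one row per observation.
# Column names are assumptions; the "e" rows are invented for this sketch.
long_df = pd.DataFrame({
    "set": [1, 2, 1, 2],
    "amount": [2, 9, 4, 5],
    "location": ["d", "d", "e", "e"],
})

# Reshape to wide form: one row per set, one column per location.
wide_df = long_df.pivot(index="set", columns="location", values="amount")
print(wide_df)
```

In wide form each location is its own column, which maps directly onto one plotted line per column.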

1 | 2 | d
2 | 9 | d

A possible scenario for this dataset would be an experiment being run in several different locations with 2 measurements taken at each location. The goal with the visualization being to visualize how the amount changed between the two sets of measurements at each location.
Member

The goal of the visualization


## Altair
If we want to plot lines to show how each location changed between set one and set two,
we need to specify the data, tell Altair to plot lines with `mark_line()`, link the x
Member

Maybe turn into a list:

  1. Specify the data: `alt.Chart(df)`
  2. Tell Altair to plot lines with `mark_line()`
  3. Link the encoding channels:
     • x with `set`
     • y with `amount`
     • etc.

![png](pics/altair-to-mpl-line_0.png)

## Matplotlib
In Matplotlib, just like with a categorical scatter plot, we have to plot a new line for every location.
Member

Link to cat scatter post if you're gonna make a reference to it


## Matplotlib
In Matplotlib, just like with a categorical scatter plot, we have to plot a new line for every location.
Specifying a label with each line allows us to generate a legend with `ax.legend()`.
Member

Label for each line
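
A short sketch of the Matplotlib side, which would also show the `ax` object in use. It assumes the same illustrative columns (`set`, `amount`, `location`) and made-up rows; the actual code in the post may differ.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Illustrative long-form data; column names and the "e" rows are assumptions.
df = pd.DataFrame({
    "set": [1, 2, 1, 2],
    "amount": [2, 9, 4, 5],
    "location": ["d", "d", "e", "e"],
})

fig, ax = plt.subplots()

# Plot one line per location, labeling each line so the legend can name it.
for location, group in df.groupby("location"):
    ax.plot(group["set"], group["amount"], label=location)

ax.set_xlabel("set")
ax.set_ylabel("amount")
ax.legend(title="location")  # built from the label given to each line
```

Each `ax.plot()` call adds one line to the same axes, and `ax.legend()` collects the labels passed in the loop.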

The reference to the other post is untested.