Drawing violin charts with Plotly, a powerful visualization tool

In the previous Plotly articles, I introduced how to use Plotly to draw common charts such as histograms, scatter plots, pie charts, candlestick (K-line) charts, and box plots. This article shows how to draw violin charts, another kind of statistical chart, using two approaches:

  • Based on plotly_express
  • Based on plotly.graph_objects

About the violin chart

Here is what a finished violin chart looks like:

So what exactly is a violin chart?

A violin plot is a chart that shows how data is distributed, together with its probability density.

A website for learning visual graphics: datavizcatalogue.com/.

It combines the features of the box plot and the density plot described earlier, so it shows the shape of the distribution as well as its summary statistics.

  • Thick black bar in the middle: the interquartile range
  • White point in the middle: the median
  • Thin black line extending from the bar: the rest of the distribution up to the adjacent values, usually the points within 1.5 times the interquartile range (some descriptions call this a 95% confidence interval)
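
To make the two ingredients concrete, here is a minimal sketch that uses only the Python standard library. The sample values, the bandwidth of 1.0 and the helper name gaussian_kde are illustrative assumptions; Plotly chooses its own bandwidth and does this work internally.

```python
import math
import statistics

# Small made-up sample, used only for illustration.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Box-plot part: the three quartile cut points (Q1, Q2, Q3) and the median.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
median = statistics.median(data)


def gaussian_kde(sample, x, bandwidth=1.0):
    """Estimate the density at x as an average of Gaussian bumps."""
    norm = 1.0 / (bandwidth * math.sqrt(2 * math.pi))
    bumps = (math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample)
    return norm * sum(bumps) / len(sample)


# Density-plot part: the violin is widest where the density is highest.
widths = {x: gaussian_kde(data, x) for x in (1, 5, 9)}

print(q1, median, q3)
print(widths)
```

The estimated density is larger near the center of the sample than at its ends, which is why a violin drawn from such data bulges in the middle.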

Importing the libraries and data

First import the required libraries:

    import numpy as np
    import pandas as pd
    import plotly_express as px  # or: import plotly.express as px (bundled with plotly >= 4)
    import plotly.graph_objects as go

This article uses the tips (restaurant bill) dataset that ships with Plotly Express:

    # Load the built-in tips dataset
    tips = px.data.tips()
    tips.head()

Strip chart of the data points

Before drawing a violin, I first draw the individual data points with px.strip:

    fig = px.strip(
        tips,            # data
        x='day',         # x and y axes
        y='total_bill',
        color='day'      # color by day
    )
    fig.show()

The points are drawn in a separate column for each of the 4 values of day:

Implementation based on Plotly_Express

Basic violin diagram

    fig = px.violin(tips, y="total_bill")  # use the total_bill column
    fig.show()

Switch to another column and plot again:

    fig = px.violin(tips, y="tip")  # use the tip column
    fig.show()

Violin chart with data points

The charts above show no data points. To display the data points next to the violin, use the points parameter:

    fig = px.violin(
        tips,
        y="total_bill",
        box=True,      # draw a box plot inside the violin
        points='all'   # 'all': every point; 'outliers': outliers only (default); False: no points
    )
    fig.show()

With points='outliers' only the points beyond the whiskers are shown, and with points=False none are shown. When the data has no outliers, the two settings give the same result:

Grouped violin chart

Draw a separate violin for each value of a column:

    fig = px.violin(
        tips,
        y="total_bill",           # data to plot
        x="day",                  # one group of violins per day
        color="sex",              # one violin per sex within each day
        box=True,
        points="all",
        hover_data=tips.columns   # columns shown on hover
    )
    fig.show()

Overlaid and grouped violin charts

The difference between the two is controlled by the violinmode parameter:

    fig = px.violin(
        tips,
        y='total_bill',
        color='sex',
        violinmode='overlay',   # 'overlay': drawn on top of each other; 'group': side by side
        hover_data=tips.columns
    )
    fig.show()

    fig = px.violin(
        tips,
        y='total_bill',
        color='sex',
        violinmode='group',     # 'overlay': drawn on top of each other; 'group': side by side
        hover_data=tips.columns
    )
    fig.show()

Based on go.Violin

Basic violin diagram

    fig = go.Figure(data=go.Violin(
        y=tips['total_bill'],    # data to plot
        box_visible=True,        # show the inner box plot
        line_color='red',        # line color
        meanline_visible=True,   # show the mean line
        fillcolor='seagreen',    # fill color
        opacity=0.5,             # transparency
        x0='Tip-violin graph'    # x-axis label for this violin
    ))
    fig.update_layout(yaxis_zeroline=False)
    fig.show()

Multiple violins

Draw several violins on one canvas. The day column of the tips data has 4 different values, so a for loop over those values adds 4 traces:

    day_list = tips["day"].unique()  # the 4 distinct days

    fig1 = go.Figure()  # create a Figure object

    # Add one trace per day by looping over day_list
    for day in day_list:
        fig1.add_trace(go.Violin(
            x=tips["day"][tips["day"] == day],
            y=tips["total_bill"][tips["day"] == day],
            name=day,
            box_visible=True,
            meanline_visible=True
        ))

    fig1.show()

Grouped violin chart

    fig2 = go.Figure()

    fig2.add_trace(go.Violin(
        x=tips[tips['sex'] == 'Male']['day'],          # x and y data for the trace
        y=tips['total_bill'][tips['sex'] == 'Male'],
        # legendgroup=' ',   # legend grouping
        # scalegroup=' ',
        name=' ',            # trace name
        line_color='blue'    # line color
    ))

    fig2.add_trace(go.Violin(
        x=tips['day'][tips['sex'] == 'Female'],
        y=tips['total_bill'][tips['sex'] == 'Female'],
        # legendgroup=' ',
        # scalegroup=' ',
        name=' ',
        line_color='orange'
    ))

    # Show the inner box and the mean line on every trace
    fig2.update_traces(box_visible=True, meanline_visible=True)
    # Violin mode: 'overlay' or 'group'
    fig2.update_layout(violinmode='group')
    fig2.show()

Split (positive and negative) violin chart

A violin has two halves, a negative (left) side and a positive (right) side. The side parameter draws only one half per trace, so two groups can share a single violin:

    import plotly.graph_objects as go
    import pandas as pd

    # pandas can read a CSV file directly from a URL
    df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv")

    fig = go.Figure()

    fig.add_trace(go.Violin(
        x=df['day'][df['smoker'] == 'Yes'],
        y=df['total_bill'][df['smoker'] == 'Yes'],
        # legendgroup='Yes',
        # scalegroup='Yes',
        name='Yes',
        side='negative',   # 'both': whole violin; 'positive': right half; 'negative': left half
        line_color='blue'
    ))

    fig.add_trace(go.Violin(
        x=df['day'][df['smoker'] == 'No'],
        y=df['total_bill'][df['smoker'] == 'No'],
        # legendgroup='No',
        # scalegroup='No',
        name='No',
        side='positive',
        line_color='lightseagreen'
    ))

    # Settings shared by all traces
    fig.update_traces(
        meanline_visible=True,   # show the mean line
        points='all',            # show all data points
        jitter=0.05,             # spread the points slightly so they overlap less
        scalemode='count'        # 'width' or 'count'
    )
    fig.update_layout(violingap=0, violinmode='overlay')  # gap between violins, and mode
    fig.show()

Advanced violin charts

Here are two advanced violin chart examples from the official website:

    import plotly.graph_objects as go
    import pandas as pd

    df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/violin_data.csv")

    # Position of the points for each day, and which traces appear in the legend
    pointpos_male = [-0.9, -1.1, -0.6, -0.3]
    pointpos_female = [0.45, 0.55, 1, 0.4]
    show_legend = [True, False, False, False]

    fig = go.Figure()

    # pd.unique(df['day']) returns the distinct values of day
    days = pd.unique(df['day'])

    for i, day in enumerate(days):
        fig.add_trace(go.Violin(
            # x and y data
            x=df['day'][(df['sex'] == 'Male') & (df['day'] == day)],
            y=df['total_bill'][(df['sex'] == 'Male') & (df['day'] == day)],
            # legend group, scale group and name
            legendgroup='M',
            scalegroup='M',
            name='M',
            # which half to draw: 'negative' is left, 'positive' is right
            side='negative',
            pointpos=pointpos_male[i],
            line_color='lightseagreen',
            showlegend=show_legend[i]
        ))

        fig.add_trace(go.Violin(
            x=df['day'][(df['sex'] == 'Female') & (df['day'] == day)],
            y=df['total_bill'][(df['sex'] == 'Female') & (df['day'] == day)],
            legendgroup='F',
            scalegroup='F',
            name='F',
            side='positive',
            pointpos=pointpos_female[i],
            line_color='mediumpurple',
            showlegend=show_legend[i]
        ))

    # Settings shared by all traces
    fig.update_traces(
        meanline_visible=True,   # show the mean line
        points='all',            # show all data points
        jitter=0.05,             # spread the points slightly so they overlap less
        scalemode='count'        # 'width' or 'count'
    )

    fig.update_layout(
        title_text="Advanced Violin Drawing",
        violingap=0,             # gap between violins
        violingroupgap=0,        # gap between violin groups
        violinmode='overlay'     # overlay mode
    )
    fig.show()

The other example draws a ridgeline plot, included here for reference:

    import plotly.graph_objects as go
    from plotly.colors import n_colors
    import numpy as np

    np.random.seed(1)

    # 12 rows of 200 random values, each row with its own spread and offset
    data = (np.linspace(1, 2, 12)[:, np.newaxis] * np.random.randn(12, 200) +
            (np.arange(12) + 2 * np.random.random(12))[:, np.newaxis])
    print(data)

    # 'rgb(5, 200, 200)' and 'rgb(200, 10, 10)' are the first and last colors,
    # 12 is the number of colors, and colortype is the color format
    colors = n_colors('rgb(5, 200, 200)', 'rgb(200, 10, 10)', 12, colortype='rgb')
    print(colors)

    fig = go.Figure()

    # One horizontal half-violin per row of data
    for data_line, color in zip(data, colors):
        fig.add_trace(go.Violin(x=data_line, line_color=color))

    fig.update_traces(orientation='h', side='positive', width=3, points=False)
    fig.update_layout(xaxis_showgrid=False, xaxis_zeroline=False)
    fig.show()
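
The color list in this example is a gradient between two end colors. As a rough illustration of the idea, the sketch below interpolates linearly between two RGB colors in plain Python. The helper interpolate_rgb is hypothetical and is not Plotly's implementation of n_colors; it only shows the kind of evenly spaced color steps that the ridgeline plot relies on.

```python
def interpolate_rgb(low, high, n):
    """Return n RGB tuples evenly spaced from low to high (inclusive)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return [
        tuple(lo + (hi - lo) * i / (n - 1) for lo, hi in zip(low, high))
        for i in range(n)
    ]


# Same end colors and count as in the ridgeline example.
colors = interpolate_rgb((5, 200, 200), (200, 10, 10), 12)

# Format each tuple as an 'rgb(r, g, b)' string, rounded to whole numbers.
labels = [f"rgb({r:.0f}, {g:.0f}, {b:.0f})" for r, g, b in colors]
print(labels[0], labels[-1])
```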