Statistical Visual Analysis of Stock Market Income Based on R Language


Original link: tecdat.cn/?p=16453 

Original source: Tuoduan Data Tribe Official Account

 

One of the most important tasks in the financial market is to analyze the historical returns of various investments. To perform this analysis, we need historical price data for the asset. There are many data providers; some are free, but most are paid. In this article, we will use data from the Yahoo Finance website.

In this article, we will:

  1. Download closing price
  2. Calculate the rate of return
  3. Calculate the mean and standard deviation of returns

Let's load the libraries first.

library(tidyquant)
library(timetk)

We will download Netflix's daily price data.

netflix <- tq_get("NFLX",
                  from = '2009-01-01',
                  to = "2018-03-01",
                  get = "stock.prices")

Next, we will plot the adjusted closing price of Netflix.

netflix %>%
  ggplot(aes(x = date, y = adjusted)) +
  geom_line() +
  ggtitle("Netflix since 2009") +
  labs(x = "Date", y = "Adjusted Price") +
  scale_x_date(date_breaks = "years", date_labels = "%Y") +
  theme_bw()

 

Calculate the daily and monthly returns of a single stock

Once we have downloaded the closing prices from Yahoo Finance, the next step is to calculate returns. We will use the tidyquant package again for the calculations. We downloaded Netflix's price data above; if you have not done so yet, see the previous section.

# Calculate daily returns
netflix_daily_returns <- netflix %>%
  tq_transmute(select = adjusted,            # column to use
               mutate_fun = periodReturn,    # function applied to the column
               period = "daily",             # compute daily returns
               col_rename = "nflx_returns")  # name of the new column

# Calculate monthly returns
netflix_monthly_returns <- netflix %>%
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,
               period = "monthly",           # compute monthly returns
               col_rename = "nflx_returns")
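As a rough sketch of what a simple period return is, the base R snippet below computes P_t / P_(t-1) - 1 by hand. The prices are made-up illustrative values, not Netflix data, and the snippet is not meant to reproduce every option of periodReturn.

```r
# Toy adjusted closing prices (hypothetical values, not real market data)
prices <- c(100, 102, 99.96, 104.958)

# Simple return for each period: price today divided by price yesterday, minus 1.
# The first price has no predecessor, so there is one fewer return than prices.
simple_returns <- prices[-1] / prices[-length(prices)] - 1

print(round(simple_returns, 4))
```

Applied to daily prices this gives daily returns; applied to month-end prices it gives monthly returns.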

Chart Netflix's daily and monthly returns

# Plot daily returns as a line chart
netflix_daily_returns %>%
  ggplot(aes(x = date, y = nflx_returns)) +
  geom_line() +
  theme_classic()

Netflix's daily return chart shows that returns fluctuate widely: the stock can move by +/- 5% on any given day. To understand the distribution of returns, we can draw a histogram.

netflix_daily_returns %>%
  ggplot(aes(x = nflx_returns)) +
  geom_histogram(binwidth = 0.015) +
  theme_classic()
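To put a number on how often large moves occur, one possible check is to count the share of days whose absolute return exceeds 5% and to tabulate the same 0.015-wide bins without drawing anything. The sketch below uses a short hypothetical return vector; the 5% threshold comes from the text, everything else is an assumption for illustration.

```r
# Hypothetical daily returns (illustrative values only)
daily_returns <- c(-0.06, -0.02, -0.01, 0.00, 0.005, 0.01, 0.02, 0.03, 0.055, 0.08)

# Share of days with a move larger than 5% in either direction
share_large_moves <- mean(abs(daily_returns) > 0.05)

# Bin counts for 0.015-wide bins, computed without plotting
bins <- seq(-0.075, 0.09, by = 0.015)
counts <- hist(daily_returns, breaks = bins, plot = FALSE)$counts

print(share_large_moves)
print(counts)
```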

Next, we can plot Netflix's monthly returns since 2009. We use a bar chart to plot the data.

# Plot Netflix's monthly returns as a bar chart
netflix_monthly_returns %>%
  ggplot(aes(x = date, y = nflx_returns)) +
  geom_bar(stat = "identity") +
  theme_classic()

 

Calculate the cumulative returns of Netflix stock

Plotting daily and monthly returns is useful for understanding the daily and monthly fluctuations of investments. To calculate the growth of an investment, in other words, to calculate the total return of an investment, we need to calculate the cumulative return of that investment. To calculate the cumulative return, we will use the  cumprod()  function.

netflix_daily_returns %>%
  mutate(cr = cumprod(1 + nflx_returns)) %>%   # growth of $1, using cumprod()
  mutate(cumulative_returns = cr - 1) %>%      # cumulative return
  ggplot(aes(x = date, y = cumulative_returns)) +
  geom_line() +
  theme_classic()
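The compounding step can be checked by hand on a tiny example. The sketch below uses three hypothetical period returns and shows that the compounded result differs from simply adding the returns.

```r
# Hypothetical period returns: +10%, -5%, +20%
returns <- c(0.10, -0.05, 0.20)

# Growth of $1: multiply (1 + r) period after period
growth <- cumprod(1 + returns)

# Cumulative return at each point in time
cumulative_returns <- growth - 1

print(cumulative_returns)   # compounded path
print(sum(returns))         # naive sum, which ignores compounding
```

The last element of cumulative_returns is the total return over the whole period.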

This chart shows Netflix's cumulative return since 2009. With the benefit of hindsight, a $1 investment made in 2009 would have earned about $85. But as we all know, that is easier said than done. Over these roughly 10 years, the investment lost 50% of its value during the Qwikster fiasco, and few investors were able to hold on through that period.

netflix_monthly_returns %>%
  mutate(cr = cumprod(1 + nflx_returns)) %>%
  mutate(cumulative_returns = cr - 1) %>%
  ggplot(aes(x = date, y = cumulative_returns)) +
  geom_line() +
  theme_classic()

We can see at a glance that the monthly cumulative return chart is much smoother than the daily one.

Multiple stocks

Download stock market data for multiple stocks.

# Set our ticker symbols as a variable
tickers <- c("FB", "AMZN", "AAPL", "NFLX", "GOOG")

# Download stock price data
multpl_stocks <- tq_get(tickers,

Draw stock price charts for multiple stocks

Next, we will draw a price chart of multiple stocks.

multpl_stocks %>%
  ggplot(aes(x = date, y = adjusted,

 

This is not the result we expected. Because these stocks trade at very different price levels (FB is below 165, AMZN is above 1950), they sit on very different scales and the cheaper stocks are squashed at the bottom of the chart. We can overcome this problem by giving each stock its own y-axis scale.

  facet_wrap(~symbol, scales = "free_y") +   # facet_wrap draws one panel per stock
  theme_classic() +

Calculate the return of multiple stocks

Calculating the returns of multiple stocks is as easy as for a single stock. Only one additional step is needed: we group the data with group_by(symbol) before calling tq_transmute(), so that returns are calculated separately for each stock.

# Calculate the daily returns of multiple stocks
multpl_stock_daily_returns <- multpl_stocks %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,
               period = 'daily',
               col_rename = 'returns')

# Calculate the monthly returns of multiple stocks
multpl_stock_monthly_returns <- multpl_stocks %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,
               period = 'monthly',
               col_rename = 'returns')
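Why the grouping matters can be seen in a small base R sketch. The tickers AAA and BBB, their prices, and the helper simple_return() are hypothetical and exist only for this illustration; ave() stands in for the grouped calculation.

```r
# Hypothetical long-format price table with two made-up tickers
prices <- data.frame(
  symbol   = rep(c("AAA", "BBB"), each = 3),
  adjusted = c(10, 11, 12.1, 200, 190, 209)
)

# Helper (hypothetical): simple returns, NA for the first row of a series
simple_return <- function(p) c(NA, p[-1] / p[-length(p)] - 1)

# Grouped: returns are computed within each symbol
prices$returns <- ave(prices$adjusted, prices$symbol, FUN = simple_return)

# Ungrouped: row 4 would compare BBB's first price with AAA's last price
ungrouped <- simple_return(prices$adjusted)

print(prices)
print(ungrouped[4])   # a meaningless cross-stock "return"
```

With grouping, the first row of each symbol has no return; without it, the boundary row mixes two unrelated price series.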

Chart the returns of multiple stocks

Once you have calculated the returns, you can plot them on a chart.

multpl_stock_daily_returns %>%
  ggplot(aes(x = date, y = returns)) +
  geom_line() +
  geom_hline(yintercept = 0) +

 

multpl_stock_monthly_returns %>%
  ggplot(aes(x = date, y = return
  scale_fill_brewer(palette = "Set1",   # we will give them a different color instead of black

Among FAANG stocks, Apple and Google are the least volatile, while Facebook and Netflix are the most volatile. Given the businesses they are in, this is not surprising. Apple is a stable company with stable cash flow. Its products are loved and used by millions of people, who are very loyal to Apple. Netflix and Facebook are also incredible businesses, but they are in a high-growth stage, and any problem (declining revenue or user growth) can have a significant impact on the stock.

Calculate the cumulative return of multiple stocks

Usually, we want to see which investment produced the best results in the past. For this, we can calculate the cumulative result. Below we compare the investment results of all FAANG stocks since 2013. Which is the best investment since 2013?

multpl_stock_monthly_returns %>%
  mutate(returns e_returns = cr-1) %>%
  ggplot(aes(x = date, y = cumulative_returns, color = symbol)) +
  geom_line() +
  labs(x = "Date"

 

Not surprisingly, Netflix has had the highest cumulative return since 2013. Amazon and Facebook ranked second and third.
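A ranking like this can be reproduced by compounding returns within each stock and comparing the final values. The sketch below is one way to do it in base R; the tickers and returns are hypothetical, and ave() with cumprod is used in place of a grouped mutate.

```r
# Hypothetical monthly returns for two made-up tickers
monthly <- data.frame(
  symbol  = rep(c("AAA", "BBB"), each = 3),
  returns = c(0.10, 0.10, -0.05, 0.02, 0.03, 0.01)
)

# Compound within each symbol, then convert growth of $1 to a cumulative return
monthly$cr <- ave(1 + monthly$returns, monthly$symbol, FUN = cumprod)
monthly$cumulative_returns <- monthly$cr - 1

# Final cumulative return per symbol, and the best performer
final <- tapply(monthly$cumulative_returns, monthly$symbol,
                function(x) x[length(x)])
best <- names(which.max(final))

print(final)
print(best)
```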

Statistical data

Calculate the mean and standard deviation of a single stock

We already have Netflix's daily and monthly return data. Now we will calculate the mean and standard deviation of the daily and monthly returns. For this, we will use the mean() and sd() functions.

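# Calculate the mean
nflx_daily_mean_ret <- netflix_daily_returns %>%
  select(nflx_returns) %>%
  .[[1]] %>%
  mean(na.rm = TRUE)

nflx_monthly_mean_ret <- netflix_monthly_returns %>%
  select(nflx_returns) %>%
  .[[1]] %>%
  mean(na.rm = TRUE)

# Calculate the standard deviation
nflx_daily_sd_ret <- netflix_daily_returns %>%
  select(nflx_returns) %>%
  .[[1]] %>%
  sd()

nflx_monthly_sd_ret <- netflix_monthly_returns %>%
  select(nflx_returns) %>%
  .[[1]] %>%
  sd()

# Collect the results in one table
nflx_stat <- tibble(period = c("Daily", "Monthly"),
                    mean = c(nflx_daily_mean_ret, nflx_monthly_mean_ret),
                    sd = c(nflx_daily_sd_ret, nflx_monthly_sd_ret))
nflx_stat

## # A tibble: 2 x 3
##   period     mean     sd
##   <chr>     <dbl>  <dbl>
## 1 Daily   0.00240 0.0337
## 2 Monthly 0.0535  0.176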

We can see that Netflix's average daily return is about 0.24%, with a standard deviation of about 3.4%. Its average monthly return is about 5.4%, with a standard deviation of about 17.6%. These figures cover the entire period since 2009. What if we want the mean and standard deviation for each year? We can get them by grouping Netflix's return data by year and then performing the calculations.

netflix_monthly_returns %>%
  mutate(year = year(date)) %>%
  group_by(year) %>%
  summarise(Monthly_Mean_Returns = mean(nflx_returns),
            MOnthly_Standard_Deviation = sd(nflx_returns))

## # A tibble: 10 x 3
##     year Monthly_Mean_Returns MOnthly_Standard_Deviation
##    <dbl>                <dbl>                      <dbl>
##  1  2009              0.0566                      0.0987
##  2  2010              0.110                       0.142
##  3  2011             -0.0492                      0.209
##  4  2012              0.0562                      0.289
##  5  2013              0.137                       0.216
##  6  2014              0.00248                     0.140
##  7  2015              0.0827                      0.148
##  8  2016              0.0138                      0.126
##  9  2017              0.0401                      0.0815
## 10  2018              0.243                       0.233
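The same group-by-year idea can be sketched in base R with tapply(). The dates and returns below are hypothetical and were chosen so that both years have the same mean but very different standard deviations, which is why the two statistics are worth reporting together.

```r
# Hypothetical monthly returns spanning two years
monthly <- data.frame(
  date = as.Date(c("2016-01-29", "2016-02-29", "2016-03-31",
                   "2017-01-31", "2017-02-28", "2017-03-31")),
  nflx_returns = c(0.10, -0.10, 0.06, 0.02, 0.04, 0.00)
)

# Extract the calendar year from each date
monthly$year <- as.integer(format(monthly$date, "%Y"))

# Mean and standard deviation of returns within each year
yearly_mean <- tapply(monthly$nflx_returns, monthly$year, mean)
yearly_sd   <- tapply(monthly$nflx_returns, monthly$year, sd)

print(yearly_mean)
print(yearly_sd)
```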

We can also plot the results to understand them better.

netflix_monthly_returns %>%
  mutate(year = rns, Standard_Deviation, keyistic)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_y_continuous(b) +
  theme_bw() +

 

We can see that since 2009, both the mean monthly return and its standard deviation have varied widely from year to year. In 2011, the average monthly return was about -5%.

Calculate the mean and standard deviation of multiple stocks

Next, we can calculate the mean and standard deviation of multiple stocks.

# Daily returns
multpl_stock_daily_returns %>%
  group_by(symbol) %>%
  summarise(mean = mean(returns),
            sd = sd(returns))

## # A tibble: 5 x 3
##   symbol     mean     sd
##   <chr>     <dbl>  <dbl>
## 1 AAPL   0.00100  0.0153
## 2 AMZN   0.00153  0.0183
## 3 FB     0.00162  0.0202
## 4 GOOG   0.000962 0.0141
## 5 NFLX   0.00282  0.0300

# Monthly returns
multpl_stock_monthly_returns %>%
  group_by(symbol) %>%
  summarise(mean = mean(returns),
            sd = sd(returns))

## # A tibble: 5 x 3
##   symbol   mean     sd
##   <chr>   <dbl>  <dbl>
## 1 AAPL   0.0213 0.0725
## 2 AMZN   0.0320 0.0800
## 3 FB     0.0339 0.0900
## 4 GOOG   0.0198 0.0568
## 5 NFLX   0.0614 0.157

Calculate the annual mean and standard deviation of returns.

multpl_stock_monthly_returns %>%
  mutate(year = year(date)) %>%
  group_by(symbol, year) %>%
  summarise(mean = mean(returns),
            sd = sd(returns))

## # A tibble: 30 x 4
## # Groups:   symbol [?]
##    symbol  year      mean     sd
##    <chr>  <dbl>     <dbl>  <dbl>
##  1 AAPL    2013  0.0210   0.0954
##  2 AAPL    2014  0.0373   0.0723
##  3 AAPL    2015 -0.000736 0.0629
##  4 AAPL    2016  0.0125   0.0752
##  5 AAPL    2017  0.0352   0.0616
##  6 AAPL    2018  0.0288   0.0557
##  7 AMZN    2013  0.0391   0.0660
##  8 AMZN    2014 -0.0184   0.0706
##  9 AMZN    2015  0.0706   0.0931
## 10 AMZN    2016  0.0114   0.0761
## # ... with 20 more rows

We can also plot this statistic.

multpl_stock_monthly_returns %>%
  mutate(year = year(date)) %>%
  group_by(symbol, yea s = seq(-0.1,0.4,0.02), labels = scales::percent) +
  scale_x_continuous(breaks = seq(2009,2018,1)) +
  labs(x = "Year", y = Stocks") +
  ggtitle

 

multpl_stock_monthly_returns %>%
  mutate(year = year(date)) %>%
  ggplot(aes(x = year, y = sd, fill = symbol)) +
  geom_bar(stat = "identity", position = "dodge", width = 0.7) +
  scale_y_continuous(breaks = seq(-0.1,0.4,0.02), labels = scales::p
  scale_fill_brewer(palette = "Set1",

 

Calculate the covariance and correlation of multiple stocks

Another important statistical calculation is the covariance and correlation of stock returns. To calculate these statistics, we need to reshape the data so that each stock's returns form one column, and then convert it into an xts object.

Covariance table

# Calculate the covariance
  tk_xts(silent = TRUE) %>%
  cov()

##               AAPL        AMZN          FB         GOOG          NFLX
## AAPL  5.254736e-03 0.001488462 0.000699818 0.0007420307 -1.528193e-05
## AMZN  1.488462e-03 0.006399439 0.001418561 0.0028531565  4.754894e-03
## FB    6.998180e-04 0.001418561 0.008091594 0.0013566480  3.458228e-03
## GOOG  7.420307e-04 0.002853157 0.001356648 0.0032287790  3.529245e-03
## NFLX -1.528193e-05 0.004754894 0.003458228 0.0035292451  2.464202e-02

Correlation table

# Calculate the correlation coefficient
  %>% tk_xts(silent = TRUE) %>%
  cor()

##              AAPL      AMZN        FB      GOOG         NFLX
## AAPL  1.000000000 0.2566795 0.1073230 0.1801471 -0.001342964
## AMZN  0.256679539 1.0000000 0.1971334 0.6276759  0.378644485
## FB    0.107322952 0.1971334 1.0000000 0.2654184  0.244905437
## GOOG  0.180147089 0.6276759 0.2654184 1.0000000  0.395662114
## NFLX -0.001342964 0.3786445 0.2449054 0.3956621  1.000000000
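The two tables are linked: the diagonal of the covariance matrix holds each stock's variance (for example, sqrt(5.254736e-03) is about 0.0725, which matches AAPL's monthly standard deviation reported earlier), and each correlation is the covariance divided by the product of the two standard deviations. The base R sketch below checks this relationship on a small wide-format matrix of hypothetical returns, one column per made-up ticker.

```r
# Hypothetical monthly returns, one column per (made-up) stock
returns <- cbind(
  AAA = c(0.02, -0.01, 0.03, 0.00),
  BBB = c(0.01, -0.02, 0.04, 0.01),
  CCC = c(-0.03, 0.02, 0.00, 0.01)
)

cov_matrix <- cov(returns)   # pairwise covariances, variances on the diagonal
cor_matrix <- cor(returns)   # pairwise correlations, ones on the diagonal

# Standard deviations recovered from the covariance diagonal
sds <- sqrt(diag(cov_matrix))

# Correlation rebuilt by hand: cov(i, j) / (sd_i * sd_j)
cor_from_cov <- cov_matrix / outer(sds, sds)

print(round(cor_matrix, 3))
print(isTRUE(all.equal(cor_from_cov, cor_matrix)))
```

Base R also provides cov2cor(), which performs the same conversion in one call.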

 

We can use the corrplot package to draw a chart of the correlation matrix.

## corrplot 0.84 loaded

  cor() %>%
  corrplot()

