# Original source: Tuoduan Data Tribe Official Account

One of the most important tasks in financial markets is analyzing the historical returns of various investments. To perform this analysis, we need historical data for the asset. There are many data providers; some are free, but most are paid. In this article, we will use data from the Yahoo Finance website.

1. Download the closing prices
2. Calculate the rate of return
3. Calculate the mean and standard deviation of returns

Let's load the libraries first.

```
library(tidyverse)  # dplyr, tidyr and ggplot2, used throughout
library(tidyquant)
library(timetk)
```

We will download Netflix's stock price data.

```
netflix <- tq_get("NFLX",
                  from = "2009-01-01",
                  to = "2018-03-01",
                  get = "stock.prices")
```

Next, we will plot the adjusted closing price of Netflix.

```
netflix %>%
  ggplot(aes(x = date, y = adjusted)) +
  geom_line() +
  ggtitle("Netflix since 2009") +
  scale_x_date(date_breaks = "years", date_labels = "%Y") +
  labs(x = "Date", y = "Adjusted Price") +
  theme_bw()
```

### Calculate the daily and monthly returns of a single stock

Once we have downloaded the closing prices from Yahoo Finance, the next step is to calculate the returns. We will use the tidyquant package again for the calculations. We downloaded Netflix's price data above; if you haven't done so yet, please see the section above.

```
# Calculate daily returns
netflix_daily_returns <- netflix %>%
  tq_transmute(select = adjusted,          # The column to use
               mutate_fun = periodReturn,  # How to process the column
               period = "daily",           # Calculate daily returns
               col_rename = "nflx_returns") # Rename the result column

# Calculate monthly returns
netflix_monthly_returns <- netflix %>%
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,
               period = "monthly",         # Calculate monthly returns
               col_rename = "nflx_returns")
```
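For intuition, with its default arithmetic setting periodReturn reports the simple return, that is, the percentage change between consecutive prices. The following is a minimal base R sketch of that arithmetic; the prices are made up for illustration and are not Netflix data.

```r
# Hypothetical adjusted closing prices for four consecutive days
prices <- c(100, 102, 99.96, 104.958)

# Simple return: r_t = p_t / p_(t-1) - 1
simple_returns <- prices[-1] / prices[-length(prices)] - 1

print(round(simple_returns, 4))
```

Each return compares a price with the one immediately before it, so four prices yield three returns.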

### Chart Netflix's daily and monthly returns

```
# Plot the daily returns as a line chart
netflix_daily_returns %>%
  ggplot(aes(x = date, y = nflx_returns)) +
  geom_line() +
  theme_classic()
```

Looking at the chart of Netflix's daily returns, we can see that they fluctuate greatly: the stock can move by +/- 5% on any given day. To understand the distribution of returns, we can draw a histogram.

```
netflix_daily_returns %>%
  ggplot(aes(x = nflx_returns)) +
  geom_histogram(binwidth = 0.015) +
  theme_classic()
```

Next, we can plot Netflix's monthly returns since 2009. We use a bar chart to plot the data.

```
# Plot Netflix's monthly returns as a bar chart
netflix_monthly_returns %>%
  ggplot(aes(x = date, y = nflx_returns)) +
  geom_bar(stat = "identity") +
  theme_classic()
```

### Calculate the cumulative returns of Netflix stock

Plotting daily and monthly returns is useful for understanding the daily and monthly fluctuations of an investment. To measure the growth of an investment, in other words its total return, we need to calculate its cumulative return. To do this, we will use the cumprod() function.
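To make the arithmetic concrete, here is a small base R sketch with three made-up period returns. Returns compound rather than add, which is why the calculation multiplies the growth factors (1 + return) with cumprod() and subtracts the initial 1 at the end.

```r
# Hypothetical period returns: +10%, -5%, +20%
returns <- c(0.10, -0.05, 0.20)

# Value of 1 unit invested at the start, after each period
growth <- cumprod(1 + returns)

# Cumulative return = growth factor minus the initial investment
cumulative_returns <- growth - 1

print(round(cumulative_returns, 4))
```

Simply summing the three returns would give 25%, while compounding gives 25.4%; the gap widens as the number of periods grows.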

```
netflix_daily_returns %>%
  mutate(cr = cumprod(1 + nflx_returns)) %>%  # Use the cumprod function
  mutate(cumulative_returns = cr - 1) %>%
  ggplot(aes(x = date, y = cumulative_returns)) +
  geom_line() +
  theme_classic()
```

This chart shows Netflix's cumulative returns since 2009. With the benefit of hindsight, a \$1 investment made in 2009 would have earned about \$85. But as we know, that is easier said than done. Over this period of roughly 10 years, the investment lost 50% of its value during the Qwikster fiasco, and few investors were able to hold on through that stretch.

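```
netflix_monthly_returns %>%
  mutate(cr = cumprod(1 + nflx_returns)) %>%
  mutate(cumulative_returns = cr - 1) %>%
  ggplot(aes(x = date, y = cumulative_returns)) +
  geom_line() +
  theme_classic()
```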

We can see at a glance that the monthly cumulative return chart is much smoother than the daily one.

## Multiple stocks

### Download stock market data for multiple stocks.

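```
# Store the ticker symbols in a variable
tickers <- c("FB", "AMZN", "AAPL", "NFLX", "GOOG")

multpl_stocks <- tq_get(tickers,
                        from = "2013-01-01",   # Assumed start date; the article analyzes data "since 2013"
                        get = "stock.prices")  # End date not recoverable; tq_get then defaults to today
```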

### Draw stock price charts for multiple stocks

Next, we will draw a price chart for multiple stocks.

```
multpl_stocks %>%
  ggplot(aes(x = date, y = adjusted, color = symbol)) +
  geom_line()
```

This is not the result we wanted. Because these stocks trade at very different price levels (FB is below 165, AMZN is above 1950), they sit on very different scales, and the lower-priced stocks look flat. We can overcome this problem by plotting each stock on its own y scale.

```
multpl_stocks %>%
  ggplot(aes(x = date, y = adjusted)) +
  geom_line() +
  facet_wrap(~symbol, scales = "free_y") +  # facet_wrap draws one panel per stock
  theme_classic()
```

### Calculate the returns of multiple stocks

Calculating the returns of multiple stocks is as easy as for a single stock. Only one additional step is needed: we add group_by(symbol) so that the returns are calculated separately for each stock.

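```
# Calculate the daily returns of multiple stocks
multpl_stock_daily_returns <- multpl_stocks %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,
               period = "daily",
               col_rename = "returns")

# Calculate the monthly returns of multiple stocks
multpl_stock_monthly_returns <- multpl_stocks %>%
  group_by(symbol) %>%
  tq_transmute(select = adjusted,
               mutate_fun = periodReturn,
               period = "monthly",
               col_rename = "returns")
```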
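To see why the grouping matters, here is a small sketch on made-up data. The tickers AAA and BBB and their prices are hypothetical, and a plain dplyr lag() calculation stands in for periodReturn. Without group_by(symbol), the first return of the second stock would be computed against the last price of the first stock.

```r
library(dplyr)

# Hypothetical prices for two made-up tickers, stacked in long format
toy_prices <- tibble(
  symbol = c("AAA", "AAA", "AAA", "BBB", "BBB", "BBB"),
  price  = c(10, 11, 12.1, 200, 190, 209)
)

# Without grouping, the first BBB row is compared with AAA's last price
ungrouped <- toy_prices %>%
  mutate(returns = price / lag(price) - 1)

# With grouping, each symbol is processed on its own,
# so the first row of every symbol has no previous price (NA)
grouped <- toy_prices %>%
  group_by(symbol) %>%
  mutate(returns = price / lag(price) - 1) %>%
  ungroup()

print(grouped)
```

In the ungrouped result the first BBB return is a meaningless jump of more than 1,500%, while the grouped result leaves it missing.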

### Chart the returns of multiple stocks

Once the returns have been calculated, we can plot them.

```
multpl_stock_daily_returns %>%
  ggplot(aes(x = date, y = returns)) +
  geom_line() +
  geom_hline(yintercept = 0) +
  facet_wrap(~symbol) +  # One panel per stock
  theme_classic()
```

```
multpl_stock_monthly_returns %>%
  ggplot(aes(x = date, y = returns, fill = symbol)) +
  geom_bar(stat = "identity") +
  scale_fill_brewer(palette = "Set1") +  # Give each stock its own color instead of black
  facet_wrap(~symbol) +
  theme_classic()
```

Among the FAANG stocks, Apple is the least volatile, while Facebook and Netflix are the most volatile. Given the businesses they are in, this is not surprising. Apple is a stable company with stable cash flow. Its products are loved and used by millions of people who are very loyal to the brand. Netflix and Facebook are also incredible businesses, but they are in a high-growth stage, and any problem (such as declining revenue or user growth) can have a significant impact on the stock.

### Calculate the cumulative return of multiple stocks

Usually, we want to see which investment produced the best results in the past. For this, we can calculate the cumulative result. Below we compare the investment results of all FAANG stocks since 2013. Which is the best investment since 2013?

```
multpl_stock_monthly_returns %>%
  group_by(symbol) %>%
  mutate(cr = cumprod(1 + returns)) %>%
  mutate(cumulative_returns = cr - 1) %>%
  ggplot(aes(x = date, y = cumulative_returns, color = symbol)) +
  geom_line() +
  labs(x = "Date", y = "Cumulative Returns")
```

Not surprisingly, Netflix has had the highest cumulative return since 2013. Amazon and Facebook ranked second and third.

# Statistical data

### Calculate the mean and standard deviation of a single stock

We already have Netflix's daily and monthly return data. Now we will calculate the mean and standard deviation of the daily and monthly returns. For this, we will use the mean() and sd() functions.

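```
# Calculate the mean
nflx_daily_mean_ret <- netflix_daily_returns %>%
  select(nflx_returns) %>%
  .[[1]] %>%
  mean(na.rm = TRUE)

nflx_monthly_mean_ret <- netflix_monthly_returns %>%
  select(nflx_returns) %>%
  .[[1]] %>%
  mean(na.rm = TRUE)

# Calculate the standard deviation
nflx_daily_sd_ret <- netflix_daily_returns %>%
  select(nflx_returns) %>%
  .[[1]] %>%
  sd()

nflx_monthly_sd_ret <- netflix_monthly_returns %>%
  select(nflx_returns) %>%
  .[[1]] %>%
  sd()

# Combine the results into one table
nflx_stat <- tibble(period = c("Daily", "Monthly"),
                    mean = c(nflx_daily_mean_ret, nflx_monthly_mean_ret),
                    sd = c(nflx_daily_sd_ret, nflx_monthly_sd_ret))
nflx_stat
```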
```
## # A tibble: 2 x 3
##   period     mean     sd
##   <chr>     <dbl>  <dbl>
## 1 Daily   0.00240 0.0337
## 2 Monthly 0.0535  0.176
```

We can see that Netflix's average daily return is 0.24%, with a standard deviation of 3.4%. Its average monthly return is about 5.35%, with a standard deviation of 17.6%. These figures cover the entire period since 2009. What if we want the mean and standard deviation for each year? We can group the Netflix return data by year and then perform the calculations.

```
netflix_monthly_returns %>%
  mutate(year = year(date)) %>%
  group_by(year) %>%
  summarise(Monthly_Mean_Returns = mean(nflx_returns),
            Monthly_Standard_Deviation = sd(nflx_returns))
```
```
## # A tibble: 10 x 3
##     year Monthly_Mean_Returns Monthly_Standard_Deviation
##    <dbl>                <dbl>                      <dbl>
##  1  2009              0.0566                      0.0987
##  2  2010              0.110                       0.142
##  3  2011             -0.0492                      0.209
##  4  2012              0.0562                      0.289
##  5  2013              0.137                       0.216
##  6  2014              0.00248                     0.140
##  7  2015              0.0827                      0.148
##  8  2016              0.0138                      0.126
##  9  2017              0.0401                      0.0815
## 10  2018              0.243                       0.233
```

We can also plot the results to understand better.

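```
netflix_monthly_returns %>%
  mutate(year = year(date)) %>%
  group_by(year) %>%
  summarise(Mean_Returns = mean(nflx_returns),
            Standard_Deviation = sd(nflx_returns)) %>%
  gather(Mean_Returns, Standard_Deviation, key = statistic, value = value) %>%
  ggplot(aes(x = year, y = value, fill = statistic)) +
  geom_bar(stat = "identity", position = "dodge") +
  scale_y_continuous(breaks = seq(-0.1, 0.4, 0.02), labels = scales::percent) +
  theme_bw()
```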

We can see that since 2009, monthly returns and standard deviations have fluctuated greatly. In 2011, the average monthly return was -5%.

### Calculate the mean and standard deviation of multiple stocks

Next, we can calculate the mean and standard deviation of multiple stocks.

```
multpl_stock_daily_returns %>%
  group_by(symbol) %>%
  summarise(mean = mean(returns),
            sd = sd(returns))
```
```
## # A tibble: 5 x 3
##   symbol     mean     sd
##   <chr>     <dbl>  <dbl>
## 1 AAPL   0.00100  0.0153
## 2 AMZN   0.00153  0.0183
## 3 FB     0.00162  0.0202
## 4 GOOG   0.000962 0.0141
## 5 NFLX   0.00282  0.0300
```
```
multpl_stock_monthly_returns %>%
  group_by(symbol) %>%
  summarise(mean = mean(returns),
            sd = sd(returns))
```
```
## # A tibble: 5 x 3
##   symbol   mean     sd
##   <chr>   <dbl>  <dbl>
## 1 AAPL   0.0213 0.0725
## 2 AMZN   0.0320 0.0800
## 3 FB     0.0339 0.0900
## 4 GOOG   0.0198 0.0568
## 5 NFLX   0.0614 0.157
```

Calculate the annual mean and standard deviation of returns.

```
multpl_stock_monthly_returns %>%
  mutate(year = year(date)) %>%
  group_by(symbol, year) %>%
  summarise(mean = mean(returns),
            sd = sd(returns))
```
```
## # A tibble: 30 x 4
## # Groups:   symbol [?]
##    symbol  year      mean     sd
##    <chr>  <dbl>     <dbl>  <dbl>
##  1 AAPL    2013  0.0210   0.0954
##  2 AAPL    2014  0.0373   0.0723
##  3 AAPL    2015 -0.000736 0.0629
##  4 AAPL    2016  0.0125   0.0752
##  5 AAPL    2017  0.0352   0.0616
##  6 AAPL    2018  0.0288   0.0557
##  7 AMZN    2013  0.0391   0.0660
##  8 AMZN    2014 -0.0184   0.0706
##  9 AMZN    2015  0.0706   0.0931
## 10 AMZN    2016  0.0114   0.0761
## # ... with 20 more rows
```

We can also plot this statistic.

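```
multpl_stock_monthly_returns %>%
  mutate(year = year(date)) %>%
  group_by(symbol, year) %>%
  summarise(mean = mean(returns),
            sd = sd(returns)) %>%
  ggplot(aes(x = year, y = mean, fill = symbol)) +
  geom_bar(stat = "identity", position = "dodge", width = 0.7) +
  scale_y_continuous(breaks = seq(-0.1, 0.4, 0.02),
                     labels = scales::percent) +
  scale_x_continuous(breaks = seq(2009, 2018, 1)) +
  labs(x = "Year", y = "Mean Returns", fill = "Stocks")
```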

```
multpl_stock_monthly_returns %>%
  mutate(year = year(date)) %>%
  group_by(symbol, year) %>%
  summarise(mean = mean(returns),
            sd = sd(returns)) %>%
  ggplot(aes(x = year, y = sd, fill = symbol)) +
  geom_bar(stat = "identity", position = "dodge", width = 0.7) +
  scale_y_continuous(breaks = seq(-0.1, 0.4, 0.02),
                     labels = scales::percent) +
  scale_fill_brewer(palette = "Set1")
```

### Calculate the covariance and correlation of multiple stocks

Another important statistical calculation is the covariance and correlation between stocks. To calculate these statistics, we need to reshape the data: we spread the returns into one column per stock and convert the result into an xts object.

Covariance table

```
# Calculate the covariance
multpl_stock_monthly_returns %>%
  ungroup() %>%
  spread(symbol, value = returns) %>%  # One column per stock
  tk_xts(silent = TRUE) %>%
  cov()
```
```
##               AAPL        AMZN          FB         GOOG          NFLX
## AAPL  5.254736e-03 0.001488462 0.000699818 0.0007420307 -1.528193e-05
## AMZN  1.488462e-03 0.006399439 0.001418561 0.0028531565  4.754894e-03
## FB    6.998180e-04 0.001418561 0.008091594 0.0013566480  3.458228e-03
## GOOG  7.420307e-04 0.002853157 0.001356648 0.0032287790  3.529245e-03
## NFLX -1.528193e-05 0.004754894 0.003458228 0.0035292451  2.464202e-02
```

Correlation table

```
# Calculate the correlation coefficient
multpl_stock_monthly_returns %>%
  ungroup() %>%
  spread(symbol, value = returns) %>%  # One column per stock
  tk_xts(silent = TRUE) %>%
  cor()
```
```
##              AAPL      AMZN        FB      GOOG         NFLX
## AAPL  1.000000000 0.2566795 0.1073230 0.1801471 -0.001342964
## AMZN  0.256679539 1.0000000 0.1971334 0.6276759  0.378644485
## FB    0.107322952 0.1971334 1.0000000 0.2654184  0.244905437
## GOOG  0.180147089 0.6276759 0.2654184 1.0000000  0.395662114
## NFLX -0.001342964 0.3786445 0.2449054 0.3956621  1.000000000
```
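The two tables are linked: a correlation is a covariance divided by the product of the two standard deviations, and each standard deviation is the square root of the matching diagonal entry of the covariance table. As a quick check, the sketch below recomputes the AAPL/AMZN correlation in base R from three numbers copied out of the covariance table above; the variable names are illustrative only.

```r
# Values copied from the covariance table
cov_aapl_amzn <- 0.001488462   # off-diagonal entry AAPL/AMZN
var_aapl      <- 5.254736e-03  # diagonal entry AAPL
var_amzn      <- 0.006399439   # diagonal entry AMZN

# correlation = covariance / (sd_1 * sd_2)
cor_aapl_amzn <- cov_aapl_amzn / (sqrt(var_aapl) * sqrt(var_amzn))

print(round(cor_aapl_amzn, 4))
```

The result agrees with the AAPL/AMZN entry of the correlation table to about four decimal places. Base R's cov2cor() applies the same conversion to a whole covariance matrix at once.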

We can use the corrplot() function from the corrplot package to draw the correlation matrix.

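```
library(corrplot)
## corrplot 0.84 loaded
```
```
multpl_stock_monthly_returns %>%
  ungroup() %>%
  spread(symbol, value = returns) %>%
  tk_xts(silent = TRUE) %>%
  cor() %>%
  corrplot()
```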
