
R Stock Data Regression Analysis Examples

Published: 2022-08-28 14:31:23

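1. Basic Data Analysis in R

This article uses R for basic statistical data analysis: basic plotting, linear fitting, logistic regression, bootstrap sampling, and ANOVA (analysis of variance), with an implementation and an application of each.
The code follows directly, with explanations in the comments.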
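1. Basic plotting (box plot, Q-Q plot)
#basic plot
x = rnorm(50) #sample data; the original snippet assumes x and y already exist
y = rnorm(50)
boxplot(x) #box plot of x
qqplot(x,y) #Q-Q plot comparing the distributions of x and y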
2. Linear fitting
#linear regression
n = 10
x1 = rnorm(n) #variable 1
x2 = rnorm(n) #variable 2
y = rnorm(n)*3 #response (pure noise here, so no real relationship is expected)
mod = lm(y~x1+x2)
model.matrix(mod) #show the design matrix of mod
plot(mod) #diagnostic plots: residuals vs fitted, Q-Q plot, scale-location, residuals vs leverage (with Cook's distance contours)
summary(mod) #get the summary statistics of the model
hatvalues(mod) #leverage of each observation; very important for detecting unusual samples
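A minimal sketch of how the leverage values from hatvalues() might be used to flag unusual samples. The cutoff of twice the average leverage (2p/n) is a common rule of thumb rather than something the original specifies, and the seed, the sample size, and the planted extreme value are assumptions made for illustration.

```r
set.seed(1)                       # assumed seed, for reproducibility only
n  <- 30
x1 <- rnorm(n)
x2 <- rnorm(n)
x1[1] <- 8                        # plant one extreme predictor value (illustrative)
y  <- 1 + 2 * x1 - x2 + rnorm(n)
mod <- lm(y ~ x1 + x2)

h <- hatvalues(mod)               # leverage of each observation
p <- length(coef(mod))            # number of estimated coefficients
cutoff <- 2 * p / n               # rule-of-thumb threshold (assumption)
flagged <- which(h > cutoff)      # indices of high-leverage observations
print(flagged)
```

Leverage depends only on the predictors, so a flagged observation is unusual in x1 or x2; whether it also distorts the fit is a separate question that Cook's distance addresses.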
3. Logistic regression

#logistic regression
x <- c(0, 1, 2, 3, 4, 5)
y <- c(0, 9, 21, 47, 60, 63) #the number of successes
n <- 70 #the number of trials
z <- n - y #the number of failures
b <- cbind(y, z) #column bind: (successes, failures)
fitx <- glm(b~x,family = binomial) #binomial GLM with the default logit link
print(fitx)

plot(x,y,xlim=c(0,5),ylim=c(0,65)) #plot the points (x,y)

beta0 <- coef(fitx)[1]
beta1 <- coef(fitx)[2]
fn <- function(x) n*exp(beta0+beta1*x)/(1+exp(beta0+beta1*x)) #expected number of successes at x
curve(fn,0,5,add=TRUE) #overlay the fitted logistic curve on the same axes
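As a hedged illustration of using the fitted model, the sketch below gets fitted success probabilities at new x values with predict() and checks them against the logistic formula written out by hand. The new x values 1.5 and 2.5 are arbitrary choices, not part of the original example.

```r
x <- c(0, 1, 2, 3, 4, 5)
y <- c(0, 9, 21, 47, 60, 63)      # successes out of n trials at each x
n <- 70
fitx <- glm(cbind(y, n - y) ~ x, family = binomial)

# fitted success probability at new x values, via predict()
newx <- data.frame(x = c(1.5, 2.5))          # arbitrary illustrative values
p_hat <- predict(fitx, newdata = newx, type = "response")

# the same quantity from the logistic formula, using the coefficients directly
b <- coef(fitx)
p_manual <- 1 / (1 + exp(-(b[1] + b[2] * newx$x)))
print(cbind(x = newx$x, p_hat, p_manual))

# x value at which the fitted success probability equals 0.5
x50 <- unname(-b[1] / b[2])
print(x50)
```

Multiplying p_hat by n gives the expected number of successes, which is what the curve in the plot shows.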
4. Bootstrap sampling

# bootstrap
# Application: resample the data with replacement, compute the ratio of the largest eigenvalue to the sum of all eigenvalues, and plot its distribution
dat = matrix(rnorm(100*5),100,5)
no.samples = 200 #number of bootstrap replicates
# theta = matrix(rep(0,no.samples*5),no.samples,5)
theta = rep(0,no.samples)
for (i in 1:no.samples)
{
j = sample(1:100,100,replace = TRUE) #draw 100 row indices with replacement
datrnd = dat[j,] #the resampled data set
lambda = princomp(datrnd)$sdev^2 #eigenvalues of the covariance matrix
# theta[i,] = lambda
theta[i] = lambda[1]/sum(lambda) #ratio of the largest eigenvalue to the total
}

# hist(theta[,1]) #plot the histogram of the first (largest) eigenvalue
hist(theta) #plot the distribution of the largest eigenvalue's share
sd(theta) #standard deviation of theta (the bootstrap standard error)

#To plot the distribution of the largest eigenvalue itself instead, uncomment each commented-out statement above and comment out the statement that follows it.
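The same bootstrap can be summarized with a standard error and a percentile confidence interval. This is a sketch under stated assumptions: it swaps princomp() for eigen(cov()), which gives the same ratio because the different variance divisors cancel, and the helper name top_share, the seed, and the 95% level are chosen here for illustration.

```r
set.seed(42)                                  # assumed seed
dat <- matrix(rnorm(100 * 5), 100, 5)
B <- 200                                      # number of bootstrap replicates

# hypothetical helper: share of the largest eigenvalue of the covariance matrix
top_share <- function(d) {
  lambda <- eigen(cov(d), symmetric = TRUE, only.values = TRUE)$values
  lambda[1] / sum(lambda)
}

theta <- replicate(B, {
  idx <- sample(nrow(dat), replace = TRUE)    # resample rows with replacement
  top_share(dat[idx, ])
})

se <- sd(theta)                               # bootstrap standard error
ci <- quantile(theta, c(0.025, 0.975))        # 95% percentile interval
print(c(estimate = top_share(dat), se = se, ci))
```

With five variables the share can never fall below 1/5, so values only slightly above 0.2 suggest no dominant direction in the data.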
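5. ANOVA (analysis of variance)

#Application: test whether an independent variable has an effect (suppose 3 kinds of vitamin are fed to pigs, 3 pigs per vitamin, and we want to know whether the vitamin matters)
#
y = rnorm(9) #weight gain by pig (Yij, i is the treatment, j is the pig id); normally entered by the user
#y = matrix(c(1,10,1,2,10,2,1,9,1),9,1)
Treatment <- factor(c(1,2,3,1,2,3,1,2,3)) #each of {1,2,3} is a group
mod = lm(y~Treatment) #linear regression
print(anova(mod))
#Explanation: Df (degrees of freedom)
#Sum Sq: sum of squared deviations (between groups for Treatment, within groups for Residuals)
#Mean Sq: Sum Sq divided by Df
#compare the contribution of Treatment with that of Residuals
#F value: Mean Sq(Treatment)/Mean Sq(Residuals)
#Pr(>F): p-value. Use it to decide whether to reject the hypothesis H0: all group population means are equal (significance level 0.05)
qqnorm(mod$residuals) #normal Q-Q plot of the residuals of mod
#If the Q-Q plot of the residuals looks like a straight line, the residuals are approximately normal, which is the assumption the F test relies on. Whether Treatment contributes anything is judged from the p-value: a large p-value means feeding more or less vitamin makes no detectable difference.
The two plots are the results for
(left) y = matrix(c(1,10,1,2,10,2,1,9,1),9,1) and
(right) y = rnorm(9).
When the data make the weight gain after vitamin 2 stand out, the residuals in the Q-Q plot no longer fall on a straight line, and the ANOVA table gives a very small p-value, i.e., the vitamin affects the pigs' weight.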
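For the fixed data set y = c(1,10,1,2,10,2,1,9,1) mentioned in the text, the decision can be read straight off the ANOVA table. The sketch below extracts the F value and p-value programmatically; the variable names and the printed message are illustrative choices.

```r
y <- c(1, 10, 1, 2, 10, 2, 1, 9, 1)           # the fixed example data from the text
Treatment <- factor(c(1, 2, 3, 1, 2, 3, 1, 2, 3))
mod <- lm(y ~ Treatment)
tab <- anova(mod)

f_value <- tab[["F value"]][1]                # Mean Sq(Treatment) / Mean Sq(Residuals)
p_value <- tab[["Pr(>F)"]][1]                 # p-value of the F test
group_means <- tapply(y, Treatment, mean)     # mean weight gain per vitamin

print(group_means)
print(c(F = f_value, p = p_value))
if (p_value < 0.05) cat("Reject H0: the group means differ\n")
```

The group means show which treatment drives the result; the F test only says that at least one group mean differs from the others.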

2. Fitting a regression in R

x<-c(1,2,3)
y<-c(3,4,5)
a<-data.frame(x,y)
a_lm<-lm(y~x,data=a)
> summary(a_lm)

Call:
lm(formula = y ~ x, data = a)

Residuals:
         1          2          3
-7.020e-17  1.404e-16 -7.020e-17

Coefficients:
             Estimate Std. Error   t value Pr(>|t|)
(Intercept) 2.000e+00  2.627e-16 7.614e+15   <2e-16 ***
x           1.000e+00  1.216e-16 8.224e+15   <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.72e-16 on 1 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 6.764e+31 on 1 and 1 DF, p-value: < 2.2e-16

The three points lie exactly on the line y = x + 2, so the residuals are zero up to floating-point error and the t and F statistics of this perfect fit carry no real information.
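A short sketch of what can be done with the fitted object from this example: reading off the coefficients and predicting at new x values. The new values 4 and 5 are hypothetical inputs chosen for illustration.

```r
a <- data.frame(x = c(1, 2, 3), y = c(3, 4, 5))
a_lm <- lm(y ~ x, data = a)

print(coef(a_lm))                              # intercept and slope of the fitted line
new_points <- data.frame(x = c(4, 5))          # hypothetical new x values
pred <- predict(a_lm, newdata = new_points)    # predictions from the fitted line
print(pred)
```

The column name in new_points has to match the predictor name used in the formula, otherwise predict() cannot find the variable.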

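3. How to get historical daily data for a set of stocks with R's quantmod package

Here is an example for reference:
> install.packages('quantmod') # install the quantmod package
> require(quantmod) # load the quantmod package
> getSymbols("GOOG",src="yahoo",from="2013-01-01", to='2013-04-24') # fetch Google's stock data from Yahoo Finance; pass a character vector such as c("GOOG","AAPL") to fetch several stocks in one call
> chartSeries(GOOG,up.col='red',dn.col='green') # draw a candlestick chart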

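Once prices are available, a typical next step for stock regression analysis is to regress a stock's daily returns on a benchmark's returns. Because downloading needs network access, the sketch below uses simulated price series in place of real data: the names stock and market, the seed, and the true slope of 1.2 are all made up for illustration. With quantmod data, the closing prices could be taken with as.numeric(Cl(GOOG)) instead.

```r
set.seed(7)                                            # assumed seed
n_days <- 250
market_ret <- rnorm(n_days, mean = 0.0003, sd = 0.01)  # simulated benchmark log returns
stock_ret  <- 0.0001 + 1.2 * market_ret + rnorm(n_days, sd = 0.008)

# turn the returns into price paths, as if the prices had been downloaded
market <- 100 * exp(cumsum(market_ret))
stock  <- 50  * exp(cumsum(stock_ret))

# daily log returns computed from the price series
r_m <- diff(log(market))
r_s <- diff(log(stock))

fit <- lm(r_s ~ r_m)             # the slope is the stock's sensitivity (beta) to the benchmark
print(coef(fit))
print(summary(fit)$r.squared)
```

Returns rather than raw prices are regressed because price levels trend over time, which tends to produce misleadingly strong fits.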
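4. How to do linear correlation and regression analysis in R

The cor() function gives the correlation coefficient between two variables, and the scatterplotMatrix() function (from the car package) draws a scatterplot matrix.

Base R, however, has no built-in function for partial correlation.
To test a correlation, call cor.test() on the two variables for a Pearson correlation analysis:
it returns the simple correlation coefficient together with a t test for judging its significance.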
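The sketch below shows cor() and cor.test() on simulated data, and one common way to get a partial correlation in base R: correlate the residuals left after regressing each variable on the control variable. The data, the seed, and the variable names are assumptions for illustration.

```r
set.seed(3)                                  # assumed seed
z <- rnorm(200)
x <- z + rnorm(200, sd = 0.5)                # x and y both depend on z
y <- z + rnorm(200, sd = 0.5)

r_xy <- cor(x, y)                            # simple Pearson correlation
test <- cor.test(x, y)                       # t test of H0: correlation = 0
print(c(r = r_xy, t = unname(test$statistic), p = test$p.value))

# partial correlation of x and y given z: correlate the residuals
# left after regressing each variable on z
rx <- resid(lm(x ~ z))
ry <- resid(lm(y ~ z))
r_partial <- cor(rx, ry)
print(r_partial)
```

Here x and y are strongly correlated only because both follow z, so the partial correlation given z comes out close to zero.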

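5. Logistic regression in R with multiple independent variables

Logistic regression is a regression model in which the response (dependent) variable takes categorical values such as True/False or 0/1. It models the probability of a binary response as a mathematical function of the predictor variables.
The general mathematical formula of logistic regression is:
y = 1/(1+e^-(a+b1x1+b2x2+b3x3+...))

The parameters used are:
y is the response variable (here, the probability that the response equals 1).
x is the predictor variable.
a and b are numeric constant coefficients.
The function used to create the regression model is glm().
Syntax
The basic syntax of glm() for logistic regression is:
glm(formula,data,family)

The parameters used are:
formula is the symbolic description of the relationship between the variables.
data is the data set that supplies the values of these variables.
family is an R object that specifies the details of the model. For logistic regression its value is binomial.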
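A minimal sketch of the glm() syntax above with more than one predictor. It uses R's built-in mtcars data set purely as a stand-in, with am (transmission type, 0 or 1) as the binary response; the choice of predictors hp and wt and the hypothetical new car are assumptions for illustration.

```r
data(mtcars)                                  # built-in data set, used only as a stand-in
# am: transmission (0 = automatic, 1 = manual), the binary response
fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)
print(summary(fit)$coefficients)

# predicted probability of am = 1 for a hypothetical car
new_car <- data.frame(hp = 120, wt = 2.8)
p_new <- predict(fit, newdata = new_car, type = "response")
print(p_new)

# classify the training rows at the 0.5 threshold and tabulate against the truth
pred_class <- as.integer(fitted(fit) > 0.5)
print(table(predicted = pred_class, actual = mtcars$am))
```

Each coefficient is on the log-odds scale: exp() of a coefficient gives the multiplicative change in the odds for a one-unit increase in that predictor, holding the others fixed.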

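6. How to run a regression analysis on given data in R (without using the lm function)

One way in is to explore what summary(lm(y~x)) actually is. First look at its class:
> m <- lm(y~x)
> class(summary(m))
[1] "summary.lm"
As expected, the result of summary() on an lm fit is a "summary.lm" object. Going further: every object in R is built on a few native data structures, so which native data structure underlies summary(lm(y~x))? The mode() command shows it.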
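For the question in the heading, one way to fit a simple linear regression without lm() is to solve the normal equations directly. This is a sketch, not the original answer's method: the data are made up, and solving (X'X) beta = X'y this way is fine for small, well-conditioned problems.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)              # made-up data

X <- cbind(Intercept = 1, x = x)              # design matrix with an intercept column
beta <- solve(crossprod(X), crossprod(X, y))  # solves (X'X) beta = X'y
fitted_y <- X %*% beta
res <- y - fitted_y                           # residuals
r2 <- 1 - sum(res^2) / sum((y - mean(y))^2)   # coefficient of determination

print(beta)
print(r2)
```

lm() itself uses a QR decomposition rather than the normal equations, which is numerically more stable when predictors are nearly collinear.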
