Resampler
方法
min
返回重新采样后的最小值。
dataframe.Resampler.min(skipna=True, numeric_only=False) #包含参数"skipna",代表是否忽略缺失值,默认为"True";参数"numeric_only",代表是否只计算数值列,默认为"False"。
#示范
d1 = DataFrame({
"timeline": [
Timestamp('2013-01-01 09:00:00'),
Timestamp('2013-01-01 09:00:01'),
Timestamp('2013-01-01 09:00:02'),
Timestamp('2013-01-01 09:00:03'),
Timestamp('2013-01-01 09:00:04'),
Timestamp('2013-01-01 09:00:05'),
Timestamp('2013-01-01 09:00:06'),
Timestamp('2013-01-01 09:00:07'),
Timestamp('2013-01-01 09:00:08'),
Timestamp('2013-01-01 09:00:09'),
],
"price": [
4, 6, 1, 4, 7, 1, 5, 8, 2, 9,
]
})
r1 = d1[["price"]].resample('5s', timeline=d1["timeline"])
return r1.min() #以5秒为采样周期,返回各个采样周期内的最小值
max
返回重新采样后的最大值。
dataframe.Resampler.max(skipna=True, numeric_only=False)
#示范
d1 = DataFrame({
"timeline": [
Timestamp('2013-01-01 09:00:00'),
Timestamp('2013-01-01 09:00:01'),
Timestamp('2013-01-01 09:00:02'),
Timestamp('2013-01-01 09:00:03'),
Timestamp('2013-01-01 09:00:04'),
Timestamp('2013-01-01 09:00:05'),
Timestamp('2013-01-01 09:00:06'),
Timestamp('2013-01-01 09:00:07'),
Timestamp('2013-01-01 09:00:08'),
Timestamp('2013-01-01 09:00:09'),
],
"price": [
4, 6, 1, 4, 7, 1, 5, 8, 2, 9,
]
})
r1 = d1[["price"]].resample('5s', timeline=d1["timeline"])
return r1.max()
sum
返回重新采样后的总和。
dataframe.Resampler.sum(skipna=True, numeric_only=False)
#示范
d1 = DataFrame({
"timeline": [
Timestamp('2013-01-01 09:00:00'),
Timestamp('2013-01-01 09:00:01'),
Timestamp('2013-01-01 09:00:02'),
Timestamp('2013-01-01 09:00:03'),
Timestamp('2013-01-01 09:00:04'),
Timestamp('2013-01-01 09:00:05'),
Timestamp('2013-01-01 09:00:06'),
Timestamp('2013-01-01 09:00:07'),
Timestamp('2013-01-01 09:00:08'),
Timestamp('2013-01-01 09:00:09'),
],
"price": [
4, 6, 1, 4, 7, 1, 5, 8, 2, 9,
]
})
r1 = d1[["price"]].resample('5s', timeline=d1["timeline"])
return r1.sum()
std
返回重新采样后的标准差。
dataframe.Resampler.std(skipna=True, numeric_only=False, ddof=n) #参数"ddof",表示用于标准差计算的自由度调整(Delta Degrees of Freedom)。n 代表自由度调整数值,具体指从样本数量中减去的数值。默认情况下,ddof=1,表示计算样本标准差。如果需要计算总体标准差,可以将ddof设置为0。
#示范
d1 = DataFrame({
"timeline": [
Timestamp('2013-01-01 09:00:00'),
Timestamp('2013-01-01 09:00:01'),
Timestamp('2013-01-01 09:00:02'),
Timestamp('2013-01-01 09:00:03'),
Timestamp('2013-01-01 09:00:04'),
Timestamp('2013-01-01 09:00:05'),
Timestamp('2013-01-01 09:00:06'),
Timestamp('2013-01-01 09:00:07'),
Timestamp('2013-01-01 09:00:08'),
Timestamp('2013-01-01 09:00:09'),
],
"price": [
4, 6, 1, 4, 7, 1, 5, 8, 2, 9,
]
})
r1 = d1[["price"]].resample('5s', timeline=d1["timeline"])
return r1.std()
var
返回重新采样后的方差。
dataframe.Resampler.var(skipna=True, numeric_only=False, ddof=n)
#示范
d1 = DataFrame({
"timeline": [
Timestamp('2013-01-01 09:00:00'),
Timestamp('2013-01-01 09:00:01'),
Timestamp('2013-01-01 09:00:02'),
Timestamp('2013-01-01 09:00:03'),
Timestamp('2013-01-01 09:00:04'),
Timestamp('2013-01-01 09:00:05'),
Timestamp('2013-01-01 09:00:06'),
Timestamp('2013-01-01 09:00:07'),
Timestamp('2013-01-01 09:00:08'),
Timestamp('2013-01-01 09:00:09'),
],
"price": [
4, 6, 1, 4, 7, 1, 5, 8, 2, 9,
]
})
r1 = d1[["price"]].resample('5s', timeline=d1["timeline"])
return r1.var()
mean
返回重新采样后的平均值。
dataframe.Resampler.mean(skipna=True, numeric_only=False)
#示范
d1 = DataFrame({
"timeline": [
Timestamp('2013-01-01 09:00:00'),
Timestamp('2013-01-01 09:00:01'),
Timestamp('2013-01-01 09:00:02'),
Timestamp('2013-01-01 09:00:03'),
Timestamp('2013-01-01 09:00:04'),
Timestamp('2013-01-01 09:00:05'),
Timestamp('2013-01-01 09:00:06'),
Timestamp('2013-01-01 09:00:07'),
Timestamp('2013-01-01 09:00:08'),
Timestamp('2013-01-01 09:00:09'),
],
"price": [
4, 6, 1, 4, 7, 1, 5, 8, 2, 9,
]
})
r1 = d1[["price"]].resample('5s', timeline=d1["timeline"])
return r1.mean()
apply
将函数应用到重新采样后的数据。
dataframe.Resampler.apply()
#示范
d1 = DataFrame({
"timeline": [
Timestamp('2013-01-01 09:00:00'),
Timestamp('2013-01-01 09:00:01'),
Timestamp('2013-01-01 09:00:02'),
Timestamp('2013-01-01 09:00:03'),
Timestamp('2013-01-01 09:00:04'),
Timestamp('2013-01-01 09:00:05'),
Timestamp('2013-01-01 09:00:06'),
Timestamp('2013-01-01 09:00:07'),
Timestamp('2013-01-01 09:00:08'),
Timestamp('2013-01-01 09:00:09'),
],
"price": [
4, 6, 1, 4, 7, 1, 5, 8, 2, 9,
]
})
r1 = d1[["price"]].resample('5s', timeline=d1["timeline"])
def cal_mean(x):
return x["price"].mean()
return r1.apply(cal_mean)