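How to compute the average of pandas pd.Timestamp objects
Problem
If you have an array of pd.Timestamp objects, you can't compute their average directly, because Timestamp objects can't be added together: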
average_timestamp_problem.py
import pandas as pd
# Creating an array of five fixed pd.Timestamp objects
timestamps = [
    pd.Timestamp('2023-01-01 12:00:00'),
    pd.Timestamp('2023-01-02 12:00:00'),
    pd.Timestamp('2023-01-03 12:00:00'),
    pd.Timestamp('2023-01-04 12:00:00'),
    pd.Timestamp('2023-01-05 12:00:00')
]
# FAIL: This will raise a TypeError
average = sum(timestamps) / len(timestamps)
This raises a TypeError, because sum() starts from the integer 0 and an integer cannot be added to a Timestamp:
error.txt
TypeError Traceback (most recent call last)
Cell In[1], line 13
4 timestamps = [
5 pd.Timestamp('2023-01-01 12:00:00'),
6 pd.Timestamp('2023-01-02 12:00:00'),
(...)
9 pd.Timestamp('2023-01-05 12:00:00')
10 ]
12 # FAIL: This will raise a TypeError
---> 13 average = sum(timestamps) / len(timestamps)
File timestamps.pyx:483, in pandas._libs.tslibs.timestamps._Timestamp.__radd__()
File timestamps.pyx:465, in pandas._libs.tslibs.timestamps._Timestamp.__add__()
TypeError: Addition/subtraction of integers and integer-arrays with Timestamp is no longer supported. Instead of adding/subtracting `n`, use `n * obj.freq`
Solution
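Instead of summing the Timestamp objects directly, sum and average their ts.value attributes (the timestamp as an integer number of nanoseconds since the Unix epoch), then convert the averaged value back to a Timestamp.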
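As a quick illustration of why this works, the following minimal sketch shows that ts.value is a plain integer and that it round-trips through pd.Timestamp. The variable names are illustrative only and not part of the original snippet.

```python
import pandas as pd

# One second after the Unix epoch (1970-01-01 00:00:00)
one_second = pd.Timestamp('1970-01-01 00:00:01')

# .value is an integer count of nanoseconds since the epoch
nanos = one_second.value  # 1_000_000_000

# An integer nanosecond count converts back to the same Timestamp
ts = pd.Timestamp('2023-01-03 12:00:00')
round_trip = pd.Timestamp(ts.value)
```

Because the values are ordinary integers, sum() and division work on them without any special handling.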
average_timestamp_solution.py
average = pd.Timestamp(sum(ts.value for ts in timestamps) / len(timestamps))
Full example:
average_timestamp_full_example.py
import pandas as pd
# Creating an array of five fixed pd.Timestamp objects
timestamps = [
    pd.Timestamp('2023-01-01 12:00:00'),
    pd.Timestamp('2023-01-02 12:00:00'),
    pd.Timestamp('2023-01-03 12:00:00'),
    pd.Timestamp('2023-01-04 12:00:00'),
    pd.Timestamp('2023-01-05 12:00:00')
]
# Result: Timestamp('2023-01-03 12:00:00')
average = pd.Timestamp(sum(ts.value for ts in timestamps) / len(timestamps))
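Two variations may be worth considering. Both are sketches rather than part of the original snippet. First, true division (/) produces a float, and a 64-bit float cannot represent every nanosecond count exactly at present-day epoch magnitudes, so integer division (//) keeps the arithmetic exact at the cost of truncating any sub-nanosecond remainder. Second, pandas can average datetime values itself via Series.mean(), assuming a reasonably recent pandas version. The names average_int and average_series are illustrative.

```python
import pandas as pd

timestamps = [
    pd.Timestamp('2023-01-01 12:00:00'),
    pd.Timestamp('2023-01-02 12:00:00'),
    pd.Timestamp('2023-01-03 12:00:00'),
    pd.Timestamp('2023-01-04 12:00:00'),
    pd.Timestamp('2023-01-05 12:00:00'),
]

# Variation 1: integer division keeps the nanosecond arithmetic exact
# Result: Timestamp('2023-01-03 12:00:00')
average_int = pd.Timestamp(sum(ts.value for ts in timestamps) // len(timestamps))

# Variation 2: let pandas compute the mean of a datetime Series
# Result: Timestamp('2023-01-03 12:00:00')
average_series = pd.Series(timestamps).mean()
```

The Series-based variant is usually the more convenient choice when the timestamps already live in a DataFrame column, while the ts.value approach works on any plain Python list.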