I have a DataFrame with NaN values in some of its columns. It looks like the sample below (the real DataFrame is much larger than what I show here):
   source  battery  Temperature                      time  Distance
0   83512     98.0          NaN  2019-10-26T00:00:06.494Z       NaN
1   83512      NaN         23.0  2019-10-26T00:00:06.538Z       NaN
2   83512      NaN          NaN  2019-10-26T00:00:06.577Z      21.0
3   83512     98.0          NaN  2019-10-26T00:30:06.702Z       NaN
4   83512      NaN         23.0  2019-10-26T00:30:06.743Z       NaN
5   83512      NaN          NaN  2019-10-26T00:30:06.781Z      21.0
6   83512     98.0          NaN  2019-10-26T01:00:08.955Z       NaN
7   83512      NaN         23.0  2019-10-26T01:00:08.998Z       NaN
8   83512      NaN          NaN  2019-10-26T01:00:09.039Z      21.0
I am looking for a way to condense the frame so that it looks more like this:
   source  battery  Temperature                      time  Distance
0   83512     98.0         23.0  2019-10-26T00:00:06.494Z      21.0
1   83512     98.0         23.0  2019-10-26T00:30:06.702Z      21.0
2   83512     98.0         23.0  2019-10-26T01:00:08.955Z      21.0
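In other words, I am trying to get rid of the NaN values in the battery, Temperature and Distance columns. Whenever the time readings are nearly identical (for example, time =
2019-10-26T00:00:06.494Z, 2019-10-26T00:00:06.538Z, 2019-10-26T00:00:06.577Z
), I want to combine all the corresponding values (source, battery, Temperature, time and Distance) into a single row. This is what I have so far: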
import pandas as pd
import requests

URL = 'https://xxxxx.com'
req = requests.get(URL, auth=('xxx', 'xxx'))
# Parse the JSON response body into a dict
json_dict = req.json()
# Flatten the nested measurement records into flat columns
# (pd.json_normalize replaces the deprecated pandas.io.json.json_normalize)
df = pd.json_normalize(json_dict['measurements'])
df = df.rename(columns={
    'source.id': 'source',
    'battery.percent.value': 'battery',
    'c8y_TemperatureMeasurement.T.value': 'Temperature Or T',
    'c8y_DistanceMeasurement.distance.value': 'Distance',
})
cols_to_keep = ['source', 'battery', 'Temperature Or T', 'time', 'Distance']
df_final = df[cols_to_keep]
# This line does not give me the expected output
df1 = df_final.apply(lambda x: pd.Series(x.dropna().values))
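
To make the desired result concrete, here is a minimal sketch of one way the collapsing could work. It is an illustration under stated assumptions, not a definitive solution. The sample frame is rebuilt by hand from the table above, and the column is called Temperature as in that table. The names collapse_readings, max_gap and burst are hypothetical. The sketch assumes that readings belonging to the same measurement arrive less than one second apart, and that separate measurements are more than one second apart.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the table in the question.
df = pd.DataFrame({
    "source": [83512] * 9,
    "battery": [98.0, np.nan, np.nan] * 3,
    "Temperature": [np.nan, 23.0, np.nan] * 3,
    "time": [
        "2019-10-26T00:00:06.494Z", "2019-10-26T00:00:06.538Z",
        "2019-10-26T00:00:06.577Z", "2019-10-26T00:30:06.702Z",
        "2019-10-26T00:30:06.743Z", "2019-10-26T00:30:06.781Z",
        "2019-10-26T01:00:08.955Z", "2019-10-26T01:00:08.998Z",
        "2019-10-26T01:00:09.039Z",
    ],
    "Distance": [np.nan, np.nan, 21.0] * 3,
})


def collapse_readings(frame, max_gap="1s"):
    """Merge rows of the same source whose timestamps are close together.

    max_gap is an assumed threshold: a gap larger than this between two
    consecutive readings of one source starts a new group ("burst").
    """
    out = frame.copy()
    out["time"] = pd.to_datetime(out["time"])
    out = out.sort_values(["source", "time"])

    # Time elapsed since the previous reading of the same source
    # (NaT for the first reading of each source).
    gap = out.groupby("source")["time"].diff()

    # A new burst starts at the first reading of a source or after a long gap.
    burst = (gap.isna() | (gap > pd.Timedelta(max_gap))).cumsum()

    # GroupBy.first() takes the first non-null value of every column,
    # so each burst collapses into one row without NaN where data exists.
    collapsed = out.groupby(burst.rename("burst")).first()
    return collapsed.reset_index(drop=True)


result = collapse_readings(df)
print(result)
```

Each output row keeps the earliest timestamp of its burst. Note that the time column comes back as parsed timestamps rather than the original strings. If the readings of one measurement can be spread further apart, max_gap would need to be adjusted accordingly.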