I use the following code to convert a pandas data frame to R:
import pandas as pd
import pandas.rpy.common as com
import rpy2.robjects as ro
from rpy2.robjects.packages import importr
rdf = com.convert_to_r_dataframe(df)
How can I convert rdf back to a pandas.DataFrame?
df = f(rdf)
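Conversion between rpy2 and pandas is included as an optional module. With it, there is no need to convert explicitly; conversion is done on the fly.
from rpy2.robjects import pandas2ri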
pandas2ri.activate()
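To convert explicitly, the functions are pandas2ri.py2ri() and pandas2ri.ri2py() (they used to be pandas2ri.pandas2ri() and pandas2ri.ri2pandas()).
In later rpy2 releases, explicit conversion goes through ro.conversion.py2rpy() and ro.conversion.rpy2py():
import pandas as pd
import rpy2.robjects as ro
dt = pd.DataFrame()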
# To R DataFrame
r_dt = ro.conversion.py2rpy(dt)
# To pandas DataFrame
pd_dt = ro.conversion.rpy2py(r_dt)
See this link for more details.
As lgautier suggested, this can be done with pandas2ri.
Here is sample code that converts an rpy2 data frame (rdf) to a pandas data frame (pd_df):
from rpy2.robjects import pandas2ri
pd_df = pandas2ri.ri2py_dataframe(rdf)
With pandas.rpy.common imported as com, as in the question, the call is:
com.convert_robj(rdf)
For example:
In [480]: dfrm
Out[480]:
A B C
0 0.454459 49.916767 1
1 0.943284 50.878174 1
2 0.974856 50.335679 2
3 0.776600 50.782104 1
4 0.553895 50.084505 1
5 0.514018 50.719019 2
6 0.915413 50.513962 0
7 0.771571 49.859855 2
8 0.068619 49.409657 0
9 0.728141 50.945174 2
10 0.388115 47.879653 1
11 0.960172 49.680258 0
12 0.015216 50.067968 0
13 0.495024 50.286287 1
14 0.565954 49.909771 1
15 0.992279 49.009696 1
16 0.179934 49.554256 0
17 0.521243 47.854791 0
18 0.551241 51.076262 1
19 0.713271 49.418503 0
20 0.801716 50.660304 1
In [481]: rdfrm = com.convert_to_r_dataframe(dfrm)
In [482]: rdfrm
Out[482]:
<DataFrame - Python:0x14905cf8 / R:0x1600ee98>
[FloatVector, FloatVector, IntVector]
A: <class 'rpy2.robjects.vectors.FloatVector'>
<FloatVector - Python:0xf9d0b00 / R:0x140e2620>
[0.454459, 0.943284, 0.974856, ..., 0.551241, 0.713271, 0.801716]
B: <class 'rpy2.robjects.vectors.FloatVector'>
<FloatVector - Python:0xf9d0878 / R:0x125aa240>
[49.916767, 50.878174, 50.335679, ..., 51.076262, 49.418503, 50.660304]
C: <class 'rpy2.robjects.vectors.IntVector'>
<IntVector - Python:0x11fceef0 / R:0x13f0d918>
[ 1, 1, 2, ..., 1, 0, 1]
In [483]: com.convert_robj(rdfrm)
Out[483]:
A B C
0 0.454459 49.916767 1
1 0.943284 50.878174 1
2 0.974856 50.335679 2
3 0.776600 50.782104 1
4 0.553895 50.084505 1
5 0.514018 50.719019 2
6 0.915413 50.513962 0
7 0.771571 49.859855 2
8 0.068619 49.409657 0
9 0.728141 50.945174 2
10 0.388115 47.879653 1
11 0.960172 49.680258 0
12 0.015216 50.067968 0
13 0.495024 50.286287 1
14 0.565954 49.909771 1
15 0.992279 49.009696 1
16 0.179934 49.554256 0
17 0.521243 47.854791 0
18 0.551241 51.076262 1
19 0.713271 49.418503 0
20 0.801716 50.660304 1
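A round trip like the one above can also be checked programmatically instead of by eye. The following is only a sketch that assumes nothing beyond pandas itself: `frames_match` is a hypothetical helper name, and the "round-tripped" frame is simulated here so the snippet runs without R. It ignores dtype width and index labels, since R integers are 32-bit and row names may come back as strings.

```python
import pandas as pd


def frames_match(original, round_tripped):
    """Return True if both frames hold the same values.

    Index labels are discarded and dtypes are not compared, because a trip
    through R may change both (assumption stated in the text above).
    """
    try:
        pd.testing.assert_frame_equal(
            original.reset_index(drop=True),
            round_tripped.reset_index(drop=True),
            check_dtype=False,
        )
    except AssertionError:
        return False
    return True


# Small frame shaped like the example above (values are illustrative).
dfrm = pd.DataFrame({"A": [0.454459, 0.943284], "B": [49.916767, 50.878174], "C": [1, 1]})

# Simulated result of a round trip: 32-bit integers and string row labels.
simulated = dfrm.copy()
simulated["C"] = simulated["C"].astype("int32")
simulated.index = simulated.index.astype(str)

same = frames_match(dfrm, simulated)

# A frame with one changed value should not match.
changed = simulated.copy()
changed.loc["0", "A"] = 123.0
different = frames_match(dfrm, changed)
```

In real use, `simulated` would be replaced by whatever the conversion back from R returns.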
From the documentation:
In [475]: com.convert_robj?
Type: function
String Form:<function convert_robj at 0x13e85848>
File: /mnt/epd/7.3-2_pandas0.12/lib/python2.7/site-packages/pandas/rpy/common.py
Definition: com.convert_robj(obj, use_pandas=True)
Docstring:
Convert rpy2 object to a pandas-friendly form
Parameters
----------
obj : rpy2 object
Returns
-------
Non-rpy data structure, mix of NumPy and pandas objects
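The following converts an rpy2 data frame r_df into a pandas data frame pd_df and avoids the deprecation warning "FutureWarning: from_items is deprecated. Use DataFrame.from_dict(dict(items), ...) instead".
r_df is of type "rpy2.robjects.vectors.DataFrame".
pd_df is of type "pandas.core.frame.DataFrame".
The code assumes pandas is imported as pd and NumPy as np:
pd_df = pd.DataFrame.from_dict({ key : np.asarray(r_df.rx2(key)) for key in r_df.names })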
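To make the column-by-column idea concrete, here is a minimal sketch that wraps it in a function. The names `r_df_to_pandas` and `FakeRDataFrame` are hypothetical; the fake class only mimics the two members the one-liner relies on (`.names` and `.rx2()`), so the snippet runs without R installed. With a real rpy2 data frame, I would expect factor columns to come through as their integer codes, so treat this as a starting point rather than a complete converter.

```python
import numpy as np
import pandas as pd


def r_df_to_pandas(r_df):
    """Build a pandas DataFrame column by column.

    Works on any object exposing `.names` and `.rx2(name)`, which is the
    interface the one-liner above uses on an rpy2 data frame.
    """
    columns = {name: np.asarray(r_df.rx2(name)) for name in r_df.names}
    return pd.DataFrame(columns)


class FakeRDataFrame:
    """Hypothetical stand-in for an rpy2 data frame (illustration only)."""

    def __init__(self, columns):
        self._columns = dict(columns)
        self.names = list(self._columns)

    def rx2(self, name):
        # rpy2's rx2 returns the column vector for a given name.
        return self._columns[name]


fake_r_df = FakeRDataFrame({"A": [0.454459, 0.943284], "C": [1, 2]})
pd_df = r_df_to_pandas(fake_r_df)
```

Passing a real `r_df` instead of `fake_r_df` should behave the same way, as long as each column converts cleanly with `np.asarray`.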
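The other solutions seem to be outdated and no longer work for me.
According to the documentation, this is the current way to convert data between pandas and R objects.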
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
From pandas to R:
with ro.default_converter + pandas2ri.converter:
    r_from_pd_df = ro.conversion.get_conversion().py2rpy(pd_df)
r_from_pd_df
From R to pandas:
with ro.default_converter + pandas2ri.converter:
    pd_from_r_df = ro.conversion.get_conversion().rpy2py(r_df)
pd_from_r_df
This only works with rpy2 version >=3.5.7. Otherwise: AttributeError: __enter__. - Alex Vorobiev
pandas.rpy was removed in pandas 0.20. - Franck Dernoncourt