Calling Python code from the LLVM JIT

Question

6

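I wrote a lexer/parser/compiler for a language in Python, which should later run in the LLVM JIT VM (using llvm-py). The first two steps are fairly straightforward for now, but (even though I have not started on the compile step yet) I see a problem when my code wants to call Python code in general, or interact with the Python lexer/parser/compiler in particular. My main concern is that the code should be able to load additional code into the VM dynamically at runtime, so it must be able to trigger the whole Python lexer/parser/compiler chain from inside the VM.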

First of all: is this even possible, or is the VM "immutable" once it has been started?

If it is possible, I currently see 3 possible solutions (I am open to other suggestions):


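1. "Break out" of the VM and make it possible to call Python functions of the main process directly (maybe by registering them as LLVM functions that somehow redirect to the main process). I did not find anything about this, and in any case I am not sure it is a good idea (security and so on).
2. Compile the runtime (statically, or dynamically at runtime) into LLVM assembly/IR. This requires that the IR code is able to modify the VM it runs in.
3. Compile the runtime (statically) into a library and load it directly into the VM. Again, it must be able to add functions (etc.) to the VM it runs in.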

- KingCrunch

2 Answers

2

You can call external C functions from LLVM JIT-ed code. What else do you need?

These external functions will be looked up in the executing process, which means that if you link Python into your VM you can call Python's C API functions.
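As a rough illustration of what "looked up in the executing process" means, the following Python 3 sketch resolves two symbols from the running process with ctypes. It assumes a POSIX system where `ctypes.CDLL(None)` returns a handle for the process itself, and a Python build that exports its C API symbols; it does not involve LLVM at all and only shows the same kind of lookup a JIT performs for an external function.

```python
import ctypes

# Assumption: POSIX only. CDLL(None) gives a handle to the running process,
# so attribute access performs a symbol lookup across everything loaded in it.
process = ctypes.CDLL(None)

# A libc symbol that is already loaded into the process.
strlen = process.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

# A Python C API symbol, visible because the interpreter lives in this process.
is_initialized = process.Py_IsInitialized
is_initialized.argtypes = []
is_initialized.restype = ctypes.c_int

length = strlen(b"python_eval")
interpreter_up = is_initialized()
print(length, interpreter_up)
```

JIT-ed code that declares an external function relies on the same mechanism: the name only has to be resolvable in the process at the time the code runs.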

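The "VM" is probably less magical than you think it is :-) In the end, it is just machine code that is emitted at runtime into a buffer and executed from there. To the extent that this code has access to other symbols in the process it runs in, it can do anything that any other code in that process can do.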
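One way to picture this, sketched in Python 3 with ctypes: a Python function can be wrapped as a plain C function pointer, and any native code in the process that knows the address can call it. The callback signature and the names `on_source` and `native_view` are made up for this illustration; how the address would be registered with a particular JIT is not shown here.

```python
import ctypes

# Hypothetical C signature for the illustration: int callback(const char *)
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_char_p)

seen = []

def on_source(text):
    # Runs inside the normal Python interpreter when native code calls in.
    seen.append(text.decode("utf-8"))
    return len(text)

# Keep a reference to the wrapper so the function pointer stays valid.
c_callback = CALLBACK(on_source)
address = ctypes.cast(c_callback, ctypes.c_void_p).value

# Stand-in for native code: rebuild a callable from the raw address only.
native_view = CALLBACK(address)
result = native_view(b"load module foo")
print(result, seen)
```

The raw `address` is an ordinary integer, so from the point of view of generated machine code this is just another function to call.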

- Eli Bendersky

- Stephen Diehl · Accepted Answer

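As Eli said, there is nothing stopping you from calling out to the Python C-API. When you call an external function from inside the LLVM JIT, it effectively just uses dlopen() on the process space, so if you are running from inside llvmpy you already have all the Python interpreter symbols accessible. You can even interact with the active interpreter that invoked the ExecutionEngine, or spin up a new Python interpreter if needed.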
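To make the "symbols are already accessible" point concrete without any LLVM involved, here is a small Python 3 sketch that calls the C API function `PyRun_SimpleString` through `ctypes.pythonapi`. The helper name `eval_via_c_api` and the variable `jit_demo_value` are invented for the example; the point is only that the C-level entry point is reachable from inside the process and acts on the live interpreter.

```python
import ctypes
import sys

def eval_via_c_api(source):
    """Run Python source through the C API entry point PyRun_SimpleString."""
    run = ctypes.pythonapi.PyRun_SimpleString
    run.argtypes = [ctypes.c_char_p]
    run.restype = ctypes.c_int
    # Returns 0 on success and -1 if the code raised an exception.
    return run(source.encode("utf-8"))

# The code executes in the __main__ namespace of the active interpreter.
status = eval_via_c_api("jit_demo_value = 6 * 7")
main_namespace = sys.modules["__main__"].__dict__
print(status, main_namespace.get("jit_demo_value"))
```

`ctypes.pythonapi` holds the GIL during the call, which is what C API functions that touch interpreter state expect.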

To get you started, create a new C file, cbits.c, containing our evaluator.

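#include <Python.h>

/* Compile and run a string of Python source in the __main__ namespace. */
void python_eval(const char* s)
{
    PyCodeObject* code = (PyCodeObject*) Py_CompileString(s, "example", Py_file_input);
    if (code == NULL) {
        PyErr_Print();  /* report syntax errors instead of crashing */
        return;
    }

    PyObject* main_module = PyImport_AddModule("__main__");  /* borrowed reference */
    PyObject* global_dict = PyModule_GetDict(main_module);   /* borrowed reference */
    PyObject* local_dict = PyDict_New();
    PyObject* obj = PyEval_EvalCode(code, global_dict, local_dict);
    if (obj == NULL)
        PyErr_Print();  /* the evaluated code raised an exception */

    // Print the result if you want.
    // PyObject_Print(obj, stdout, 0);

    /* Release the references we own. */
    Py_XDECREF(obj);
    Py_XDECREF(local_dict);
    Py_DECREF(code);
}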

Here is a small Makefile to compile it:

CC = gcc
LPYTHON = $(shell python-config --includes)
CFLAGS = -shared -fPIC -lpthread $(LPYTHON)

.PHONY: all clean

all:
	$(CC) $(CFLAGS) cbits.c -o cbits.so

clean:
	-rm -f cbits.so

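We start with the usual LLVM boilerplate, but use ctypes to load our cbits.so shared object into the global process space so that the python_eval symbol is available. Then we just create a simple LLVM module with one function, allocate a string holding some Python source with ctypes, and pass the pointer to the ExecutionEngine, which runs the JIT-ed function from our module. That function in turn passes the Python source to the C function, which invokes the Python C-API and then returns to the LLVM JIT.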
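The pointer hand-off described above can be tried in isolation. This Python 3 sketch (no LLVM required) allocates a C string buffer, extracts its raw address as an integer, which is the kind of value the script passes on as a pointer argument, and reads the string back from that address the way C code receiving a `const char*` would.

```python
import ctypes

source = b"print('Hello from Python')"

# Allocate a NUL-terminated C buffer holding the source text.
buf = ctypes.create_string_buffer(source)

# The raw address of the buffer as a plain integer.
address = ctypes.cast(buf, ctypes.c_void_p).value

# What a C function receiving this address as `const char*` would read.
recovered = ctypes.string_at(address)
print(hex(address), recovered)
```

Note that `buf` has to stay alive for as long as the address is in use; once the buffer is garbage collected the address dangles.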

import llvm.core as lc
import llvm.ee as le

import ctypes
import inspect

# Load cbits.so with RTLD_GLOBAL so the JIT can resolve python_eval.
ctypes.CDLL('./cbits.so', mode=ctypes.RTLD_GLOBAL)

pointer = lc.Type.pointer

i32 = lc.Type.int(32)
i64 = lc.Type.int(64)

char_type  = lc.Type.int(8)
string_type = pointer(char_type)

zero = lc.Constant.int(i64, 0)

def build():
    mod = lc.Module.new('call python')
    evalfn = lc.Function.new(mod,
        lc.Type.function(lc.Type.void(),
        [string_type], False), "python_eval")

    funty = lc.Type.function(lc.Type.void(), [string_type])

    fn = lc.Function.new(mod, funty, "call")
    fn_arg0 = fn.args[0]
    fn_arg0.name = "input"

    block = fn.append_basic_block("entry")
    builder = lc.Builder.new(block)

    builder.call(evalfn, [fn_arg0])
    builder.ret_void()

    return fn, mod

def run(fn, mod, buf):

    tm = le.TargetMachine.new(features='', cm=le.CM_JITDEFAULT)
    eb = le.EngineBuilder.new(mod)
    engine = eb.create(tm)

    ptr = ctypes.cast(buf, ctypes.c_voidp)
    ax = le.GenericValue.pointer(ptr.value)

    print 'IR'.center(80, '=')
    print mod

    mod.verify()
    print 'Assembly'.center(80, '=')
    print mod.to_native_assembly()

    print 'Result'.center(80, '=')
    engine.run_function(fn, [ax])

if __name__ == '__main__':
    # If you want to evaluate the source of an existing function
    # source_str = inspect.getsource(mypyfn)

    # If you want to pass a source string
    source_str = "print 'Hello from Python C-API inside of LLVM!'"

    buf = ctypes.create_string_buffer(source_str)
    fn, mod = build()
    run(fn, mod, buf)

You should see the following output:

=======================================IR=======================================
; ModuleID = 'call python'

declare void @python_eval(i8*)

define void @call(i8* %input) {
entry:
  call void @python_eval(i8* %input)
  ret void
}
=====================================Result=====================================
Hello from Python C-API inside of LLVM!