How thread-safe is Thrift? I'm having problems where requests seem to interfere with each other.

9

Edit

Apparently what I was hoping to do is beyond the scope of thrift... everything works fine as long as I make sure there is never more than one client on a port at a time. Of course, that rather defeats the purpose, since I'd like to keep several reusable connections open to the server to improve response times and lower overhead.

If anyone has an alternative way to achieve this (or can show that my conclusion is wrong), that would be much appreciated.

Background

I have a multi-component application that is mostly tied together with thrift (mainly java->php connections).

Everything had been looking fine so far, but then I introduced a Java->Java connection whose client endpoint is a servlet that can fire hundreds of requests per second.

The method being accessed has the following interface:

bool pvCheck(1:i32 toolId) throws(1:DPNoToolException nte),

To make sure the problem wasn't in the service itself, I replaced it with a trivial implementation:
    @Override
    public boolean pvCheck(int toolId) throws TException {
        //boolean ret = api.getViewsAndDec(toolId);
        return true;
    }

Errors / possible causes?

Everything works fine as long as there aren't many connections, but as soon as requests start arriving close together, connections start getting stuck in the reader.

If I look at one of those connections in the debugger, the stack looks like this:

Daemon Thread [http-8080-197] (Suspended)   
    BufferedInputStream.read(byte[], int, int) line: 308    
    TSocket(TIOStreamTransport).read(byte[], int, int) line: 126    
    TSocket(TTransport).readAll(byte[], int, int) line: 84  
    TBinaryProtocol.readAll(byte[], int, int) line: 314 
    TBinaryProtocol.readI32() line: 262 
    TBinaryProtocol.readMessageBegin() line: 192    
    DumboPayment$Client.recv_pvCheck() line: 120    
    DumboPayment$Client.pvCheck(int) line: 105  
    Receiver.performTask(HttpServletRequest, HttpServletResponse) line: 157 
    Receiver.doGet(HttpServletRequest, HttpServletResponse) line: 109   
    Receiver(HttpServlet).service(HttpServletRequest, HttpServletResponse) line: 617    
    Receiver(HttpServlet).service(ServletRequest, ServletResponse) line: 717    
    ApplicationFilterChain.internalDoFilter(ServletRequest, ServletResponse) line: 290  
    ApplicationFilterChain.doFilter(ServletRequest, ServletResponse) line: 206  
    StandardWrapperValve.invoke(Request, Response) line: 233    
    StandardContextValve.invoke(Request, Response) line: 191    
    StandardHostValve.invoke(Request, Response) line: 127   
    ErrorReportValve.invoke(Request, Response) line: 102    
    StandardEngineValve.invoke(Request, Response) line: 109 
    CoyoteAdapter.service(Request, Response) line: 298  
    Http11AprProcessor.process(long) line: 859  
    Http11AprProtocol$Http11ConnectionHandler.process(long) line: 579   
    AprEndpoint$Worker.run() line: 1555 
    Thread.run() line: 619  

These seem to be triggered by data corruption, since I also get the following exceptions:

10/11/22 18:38:55 WARN logger.Receiver: pvCheck had an exception
org.apache.thrift.TApplicationException: pvCheck failed: unknown result
    at *.thrift.generated.DumboPayment$Client.recv_pvCheck(DumboPayment.java:135)
    at *.thrift.generated.DumboPayment$Client.pvCheck(DumboPayment.java:105)
    at *.Receiver.performTask(Receiver.java:157)
    at *.Receiver.doGet(Receiver.java:109)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
    at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:859)
    at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:579)
    at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1555)
    at java.lang.Thread.run(Thread.java:619)

and

10/11/22 17:59:46 ERROR [/ninja_ar].[Receiver]: サーブレット Receiver のServlet.service()が例外を投げました
java.lang.OutOfMemoryError: Java heap space
    at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:296)
    at org.apache.thrift.protocol.TBinaryProtocol.readString(TBinaryProtocol.java:290)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:198)
    at *.thrift.generated.DumboPayment$Client.recv_pvCheck(DumboPayment.java:120)
    at *.thrift.generated.DumboPayment$Client.pvCheck(DumboPayment.java:105)
    at *.Receiver.performTask(Receiver.java:157)
    at *.Receiver.doGet(Receiver.java:109)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:690)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:269)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:210)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:870)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:685)
    at java.lang.Thread.run(Thread.java:636)

I may be way off here, but I'm fairly sure this has to do with the client continuing to try to read when nothing is being sent.
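One aside: that blocked read has no timeout, so it sits in BufferedInputStream.read() forever. TSocket also has a constructor that takes a socket timeout, which would at least turn a stuck read into a TTransportException instead of a hang. A minimal sketch (the 5000 ms value is arbitrary):

// Same construction as in the pool below, plus a read timeout so a
// stuck read fails fast instead of blocking indefinitely.
TTransport transport = new TSocket(serverArr[roundRobinPos], port, 5000);
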
Some implementation details:

Both the server and the client use the Java binary protocol.

I wrote a simple client pool class that lets me reuse clients; these are its main functions:
public synchronized Client getClient() {
    if(clientQueue.isEmpty()) {
        return newClient();
    } else {
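        // note: getLast() returns the tail client but does not remove it from the queue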
        return clientQueue.getLast();
    }
}

private synchronized Client newClient() {
    int leftToTry = serverArr.length;
    Client cli = null;
    while(leftToTry > 0 && cli == null) {
        log.info("Creating new connection to " + 
                serverArr[roundRobinPos] + port);
        TTransport transport = new TSocket(serverArr[roundRobinPos], port);
        TProtocol protocol = new TBinaryProtocol(transport);
        cli = new Client(protocol);
        try {
            transport.open();
        } catch (TTransportException e) {
            cli = null;
            log.warn("Failed connection to " + 
                    serverArr[roundRobinPos] + port);
        }

        roundRobinPos++;
        if(roundRobinPos >= serverArr.length) {
            roundRobinPos = 0;
        }
        leftToTry--;
    }

    return cli;
}

public void returnClient(Client cli) {
    clientQueue.addFirst(cli);
}

The client application (i.e. the Tomcat servlet) accesses it in the following way:
    Client dpayClient = null;
    if(dpay != null
            && (dpayClient = dpay.getClient()) != null) {

        try {
            dpayClient.pvCheck(requestParameters.getId());
        } catch (DPNoToolException e) {
            return;
        } catch (TException e) {
            log.warn("pvCheck had an exception", e);
        } finally {
            if(dpayClient != null) {
                dpay.returnClient(dpayClient);
            }
        }
    }

The actual Thrift server is brought up like this:
private boolean initThrift(int port, Configuration conf) {
    TProtocolFactory protocolFactory = new TBinaryProtocol.Factory();
    DPaymentHandler handler = new DPaymentHandler(conf);

    DumboPayment.Processor processor = 
        new DumboPayment.Processor(handler);

    InetAddress listenAddress;
    try {
        listenAddress = InetAddress.getLocalHost();
    } catch (UnknownHostException e) {
        LOG.error("Failed in thrift init", e);
        return false;
    }
    TServerTransport serverTransport;
    try {
        serverTransport = new TServerSocket(
                new InetSocketAddress(listenAddress, port));
    } catch (TTransportException e) {
        LOG.error("Failed in thrift init", e);
        return false;
    }

    TTransportFactory transportFactory = new TTransportFactory();
    TServer server = new TThreadPoolServer(processor, serverTransport,
            transportFactory, protocolFactory);

    LOG.info("Starting Dumbo Payment thrift server on " + 
            listenAddress + ":" + Integer.toString(port));
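    // note: serve() blocks for the lifetime of the server, so the
    // return statement below is only reached after the server stops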
    server.serve();

    return true;
}

Finally

I've been stuck on this for quite a while now... and I may well be missing something obvious. Any help would be appreciated.

If any additional information is needed, please let me know. There is a lot to this, so I tried to keep it to the most relevant bits.

3 Answers

11
My guess is that you have multiple threads trying to use the same client at the same time, and I'm not sure that is supported at all. You could try the asynchronous interface and build a thread-safe resource pool to hand out your clients.

With Thrift-0.5.0.0, here is an example of creating an AsyncClient for your thrift-generated code:
Factory fac = new AsyncClient.Factory(new TAsyncClientManager(), new TProtocolFactory() {
    @Override
    public TProtocol getProtocol( TTransport trans ) {
        return new TBinaryProtocol(trans);
    }
});
AsyncClient cl = fac.getAsyncClient( new TNonblockingSocket( "127.0.0.1", 12345 ));

However, if you look at the source, you'll notice it has a single-threaded message handler; even though it uses NIO sockets, that is likely to become a bottleneck. To get more throughput you have to create more async clients, check them out, and return them properly.

To make managing them easier, I made a quick class. The only thing you need to do is include your package, and it should suit your needs, although I haven't really put it through its paces (no real testing at all, in fact):
public class Thrift {

    // This is the request; public so that callers can subclass it
    public static abstract class ThriftRequest {

        private void go( final Thrift thrift, final AsyncClient cli ) {
            on( cli );
            thrift.ret( cli );
        }

        public abstract void on( AsyncClient cli );
    }

    // Holds all of our Async Clients
    private final ConcurrentLinkedQueue<AsyncClient>   instances = new ConcurrentLinkedQueue<AsyncClient>();
    // Holds all of our postponed requests
    private final ConcurrentLinkedQueue<ThriftRequest> requests  = new ConcurrentLinkedQueue<ThriftRequest>();
    // Holds our executor, if any
    private Executor                                 exe       = null;

    /**
     * This factory runs in thread bounce mode, meaning that if you call it from 
     * many threads, execution bounces between calling threads depending on when        
     * execution is needed.
     */
    public Thrift(
            final int clients,
            final int clients_per_message_processing_thread,
            final String host,
            final int port ) throws IOException {

        // We only need one protocol factory
        TProtocolFactory proto_fac = new TProtocolFactory() {

            @Override
            public TProtocol getProtocol( final TTransport trans ) {
                return new TBinaryProtocol( trans );
            }
        };

        // Create our clients
        Factory fac = null;
        for ( int i = 0; i < clients; i++ ) {

            if ( fac == null || i % clients_per_message_processing_thread == 0 ) {
                fac = new AsyncClient.Factory(
                    new TAsyncClientManager(),
                    proto_fac );
            }

            instances.add( fac.getAsyncClient( new TNonblockingSocket(
                host,
                port ) ) );
        }
    }
    /**
     * This factory runs callbacks in whatever mode the executor is setup for,
     * not on calling threads.
     */
    public Thrift( Executor exe,
            final int clients,
            final int clients_per_message_processing_thread,
            final String host,
            final int port ) throws IOException {
        this( clients, clients_per_message_processing_thread, host, port );
        this.exe = exe;
    }

    // Call this to grab an instance
    public void
            req( final ThriftRequest req ) {
        final AsyncClient cli;
        synchronized ( instances ) {
            cli = instances.poll();
        }
        if ( cli != null ) {
            if ( exe != null ) {
                // Executor mode
                exe.execute( new Runnable() {

                    @Override
                    public void run() {
                        req.go( Thrift.this, cli );
                    }

                } );
            } else {
                // Thread bounce mode
                req.go( this, cli );
            }
            return;
        }
        // No clients immediately available
        requests.add( req );
    }

    private void ret( final AsyncClient cli ) {
        final ThriftRequest req;
        synchronized ( requests ) {
            req = requests.poll();
        }
        if ( req != null ) {
            if ( exe != null ) {
                // Executor mode
                exe.execute( new Runnable() {

                    @Override
                    public void run() {
                        req.go( Thrift.this, cli );
                    }
                } );
            } else {
                // Thread bounce mode
                req.go( this, cli );
            }
            return;
        }
        // We did not need this immediately, hold onto it
        instances.add( cli );

    }

}

An example of how to use it:
// Make the pool: 10 clients, with a new message-processing thread for every 5
Thrift t = new Thrift( 10, 5, "localhost", 8000 );
// Use the pool
t.req( new ThriftRequest() {

    @Override
    public void on( AsyncClient cli ) {
        cli.MyThriftMethod( "stringarg", 111, new AsyncMethodCallback<AsyncClient.MyThriftMethod_call>() {
            @Override
            public void onError( Throwable throwable ) {
            }

            @Override
            public void onComplete( MyThriftMethod_call response ) {
            }
        });
    }
} );

You might also want to try the different server modes, e.g. the THsHaServer, to see what works best for your environment.
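
For reference, a minimal sketch of what that swap might look like in the initThrift method above, assuming the Thrift 0.5-era constructors (note that the nonblocking servers use framed transport, so every client would also have to wrap its TSocket in a TFramedTransport):

// Nonblocking server socket in place of the TServerSocket above
TNonblockingServerTransport serverTransport =
        new TNonblockingServerSocket(port);
DumboPayment.Processor processor =
        new DumboPayment.Processor(new DPaymentHandler(conf));
// THsHaServer: one thread does the nonblocking I/O, a worker pool runs the handlers
TServer server = new THsHaServer(processor, serverTransport);
server.serve();

// Client side: a nonblocking server requires framed transport
TTransport transport = new TFramedTransport(
        new TSocket(serverArr[roundRobinPos], port));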


Thanks for the detailed answer and the code! But I don't believe the problem is simultaneous use of the clients? All clients are handed out by the pool, the accessor function only returns unused clients, and everything is synchronized. Since posting the question I've split the clients to run against different ports and limited their number, and that seems to have fixed it... so now I run 4 Thrift server instances on each server and, at startup, each client creates at most one client per port. So my guess is that the streams were getting mixed together. - juhanic
Also, note that "org.apache.thrift.TApplicationException: pvCheck failed: unknown result" usually happens when an unexpected null result is passed back. That can be caused by an OutOfMemory error on the server. Starting 4 instances may have quadrupled your memory and reduced the chance of hitting the OOM that caused the null to be passed back. - Nthalk
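
For reference, the tail of the generated recv_pvCheck looks roughly like this (paraphrased from the usual thrift-generated pattern, not copied from this project); the final throw is where the "unknown result" in the log above comes from:

// result was read off the wire; success was never set and no exception came back
if (result.isSetSuccess()) {
    return result.success;
}
if (result.nte != null) {
    throw result.nte;
}
throw new TApplicationException(TApplicationException.MISSING_RESULT,
        "pvCheck failed: unknown result");
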
That makes a lot of sense, I'll look into it more, although memory usage has stayed low the whole time... - juhanic

0

Try using a THttpClient instead of the TSocket in the following line:

TTransport transport = new TSocket(serverArr[roundRobinPos], port);
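
A hedged sketch of what that swap might look like (the "/dpay" path is hypothetical; THttpClient takes a URL rather than a host/port pair, and the server would have to expose the service over HTTP instead of a raw socket for this to work):

// Each request is sent as an HTTP POST instead of over a persistent socket
TTransport transport = new THttpClient(
        "http://" + serverArr[roundRobinPos] + ":" + port + "/dpay");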

-1

The returnClient function of your ClientPool is not thread-safe:

public void returnClient(Client cli) {
    clientQueue.addFirst(cli);
}
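
Presumably the point being made is that, unlike getClient and newClient, this method is not synchronized, so two threads returning clients at the same time can corrupt the backing (non-thread-safe) queue. A minimal sketch of the fix this seems to imply:

// Synchronize on the pool like the other accessors do, so concurrent
// returns cannot corrupt the backing deque.
public synchronized void returnClient(Client cli) {
    clientQueue.addFirst(cli);
}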

This answer is of little use unless you explain why it isn't thread-safe. - David Hoelzer
