Caveats on returnResource exceptions when using the Jedis connection pool

bangongJIAO1@c, published 2025-12-04
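Contents
  • A worker thread terminated abnormally in production
    • Sample code
    • Log output
  • Analysis
    • Execution order
    • The Redis access logic
    • Actual log output
      • Analysis:
      • Solution:
  • Supplement
    • Temporary workaround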

      A worker thread terminated abnormally in production

      The log first shows several SocketTimeoutExceptions, followed suddenly by a ClassCastException:

      redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
      ...
      java.lang.ClassCastException: [B cannot be cast to java.lang.Long
              at redis.clients.jedis.Connection.getIntegerReply(Connection.java:208)
              at redis.clients.jedis.Jedis.sismember(Jedis.java:1307)

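      By manually simulating a network failure in a local environment, I was eventually able to reproduce this production exception.

      After further analysis (form a hypothesis, then verify it), I found the cause of the problem.

      Sample code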

      JedisPool pool = ...;
      Jedis jedis = pool.getResource();
      String value = jedis.get("foo");
      System.out.println(value);
      System.out.println("Make SocketTimeoutException");
      System.in.read(); // wait while the SocketTimeoutException is being provoked
      try {
          value = jedis.get("foo");
          System.out.println(value);
      } catch (JedisConnectionException e) {
          e.printStackTrace();
      }
      System.out.println("Recover from SocketTimeoutException");
      System.in.read();  // wait for the network to be restored
      Thread.sleep(5000); // sleep a little longer so the network recovers completely
      boolean isMember = jedis.sismember("urls", "baidu.com");

      Log output

      bar
      Make SocketTimeoutException
      redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
      Recover from SocketTimeoutException
          at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:210)
          at redis.clients.util.RedisInputStream.readByte(RedisInputStream.java:47)
          at redis.clients.jedis.Protocol.process(Protocol.java:131)
          at redis.clients.jedis.Protocol.read(Protocol.java:196)
          at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:283)
          at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:202)
          at redis.clients.jedis.Connection.getBulkReply(Connection.java:191)
          at redis.clients.jedis.Jedis.get(Jedis.java:101)
          at com.tcl.recipevideohunter.JedisTest.main(JedisTest.java:23)
      Caused by: java.net.SocketTimeoutException: Read timed out
          at java.net.SocketInputStream.socketRead0(Native Method)
          at java.net.SocketInputStream.read(SocketInputStream.java:152)
          at java.net.SocketInputStream.read(SocketInputStream.java:122)
          at java.net.SocketInputStream.read(SocketInputStream.java:108)
          at redis.clients.util.RedisInputStream.ensureFill(RedisInputStream.java:204)
          ... 8 more
      Exception in thread "main" java.lang.ClassCastException: [B cannot be cast to java.lang.Long
          at redis.clients.jedis.Connection.getIntegerReply(Connection.java:208)
          at redis.clients.jedis.Jedis.sismember(Jedis.java:1307)
          at com.tcl.recipevideohunter.JedisTest.main(JedisTest.java:32)

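      Analysis

      When the second get("foo") runs, the read times out and the GET foo command has not yet reached Redis; it is still pending on the connection. By the time sismember runs, the network is back to normal and the same Jedis instance is reused, so the pending GET foo is delivered ahead of the SISMEMBER, and the first reply that comes back belongs to GET foo.

      Execution order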

      127.0.0.1:9379> get foo
      "bar"
      127.0.0.1:9379> sismember urls baidu.com
      (integer) 1

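      As a result, the final sismember in the sample code receives the reply to get foo, which is a string (a byte array, shown as [B in the trace), whereas sismember expects a Long. That is what triggers the ClassCastException.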
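
The mix-up can be reproduced without Redis. The following is a minimal sketch, not Jedis code: FakeConnection and StaleReplyDemo are hypothetical names, and the only assumption is that replies on one connection are read strictly in the order the commands were sent.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of one connection: replies are read in the order the commands were sent. */
class FakeConnection {
    private final Deque<Object> pendingReplies = new ArrayDeque<>();

    /** "Sends" a command by queueing the reply the server will eventually produce. */
    void send(Object serverReply) {
        pendingReplies.addLast(serverReply);
    }

    /** Reads the next reply on the wire, whichever command it belongs to. */
    Object readReply() {
        return pendingReplies.pollFirst();
    }
}

class StaleReplyDemo {
    static String run() {
        FakeConnection conn = new FakeConnection();

        // GET foo: the caller times out and never reads this reply,
        // but the reply still arrives once the network recovers.
        conn.send("bar".getBytes());

        // SISMEMBER urls baidu.com: its real reply is the integer 1.
        conn.send(Long.valueOf(1L));

        try {
            // The caller expects an integer but reads the stale bulk reply first.
            Long reply = (Long) conn.readReply();
            return "ok: " + reply;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The cast fails because the first unread reply on the connection is the byte array left over from the timed-out GET, not the integer that belongs to SISMEMBER.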

      The Redis access logic

      Why did this problem show up in production? Because the code that talks to Redis looked roughly like this:

      while(true){
          Jedis jedis = null;
          try {
              jedis = pool.getResource();
              //some redis operation here.
          } catch (Exception e) {
              logger.error(e);
          } finally {
              pool.returnResource(jedis);
          }
      }

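      When the network fails, pool.returnResource(jedis) still succeeds, so the instance goes back into the pool (jedis is not null at that point). Once the network recovers, and because this is a multithreaded environment, some other thread later obtains the same Jedis instance from pool.getResource(). If the reply type of that thread's Jedis operation differs from the reply type of the first operation that failed on this instance during the outage (for example, one is get and the other is sismember), a ClassCastException is thrown.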

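      And that is the lucky case. If both operations return the same type (for example lpop("queue_order_pay_failed") and lpop("queue_order_pay_success")), no exception is thrown at all and the caller silently receives the wrong data. I would rather not imagine the consequences.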
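
To make the silent case concrete, here is a small sketch under the same assumption that replies are consumed in send order. SilentMixupDemo and the order IDs are made up for illustration; only the two queue names come from the text.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class SilentMixupDemo {
    static String run() {
        // Replies waiting on one shared connection, in send order.
        Deque<String> wire = new ArrayDeque<>();

        // Thread A: LPOP queue_order_pay_failed, then a read timeout.
        // The reply is never read and the connection goes back to the pool.
        wire.addLast("order-1001 from queue_order_pay_failed");

        // Thread B borrows the same connection: LPOP queue_order_pay_success.
        wire.addLast("order-2002 from queue_order_pay_success");

        // Thread B reads the first reply on the wire. The types match,
        // so nothing fails and the wrong order is processed.
        return wire.pollFirst();
    }

    public static void main(String[] args) {
        System.out.println("Thread B got: " + run());
    }
}
```

Thread B asked for an entry from the success queue and received one popped from the failure queue, with no exception to signal the error.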

      For example, insert a get("nonexist-key") before the sismember in the first sample code (the key does not exist in Redis, so the call should return null).

      value = jedis.get("nonexist-key");
      System.out.println(value);
      boolean isMember = jedis.sismember("urls", "baidu.com");
      System.out.println(isMember);

      Actual log output

      bar
      Exception in thread "main" java.lang.NullPointerException
          at redis.clients.jedis.Jedis.sismember(Jedis.java:1307)
          at com.tcl.recipevideohunter.JedisTest.main(JedisTest.java:37)

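      Analysis:

      get("nonexist-key") receives the reply to the earlier get("foo"), and sismember receives the reply to get("nonexist-key"). That reply is null, so this time a NullPointerException is thrown, presumably when sismember unboxes the null reply to compare it with 1.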
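
The unboxing step can be shown in isolation. This sketch assumes that the integer reply is a boxed Long that gets compared with 1, which is consistent with the stack trace above but is an assumption about Jedis internals; NullReplyDemo and isOne are hypothetical names.

```java
class NullReplyDemo {
    /** Mimics comparing a boxed integer reply with 1. */
    static boolean isOne(Long integerReply) {
        // Comparing a Long with an int unboxes the Long first,
        // so a null reply throws NullPointerException here.
        return integerReply == 1;
    }

    static String run() {
        try {
            return String.valueOf(isOne(null));
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```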

      Solution:

      Do not call returnResource unconditionally. A more robust and cleaner way to handle it is shown below:

      while(true){
          Jedis jedis = null;
          boolean broken = false;
          try {
              jedis = jedisPool.getResource();
              return jedisAction.action(jedis); // template method
          } catch (JedisException e) {
              broken = handleJedisException(e);
              throw e;
          } finally {
              closeResource(jedis, broken);
          }
      }
      
      /**
       * Handle jedisException, write log and return whether the connection is broken.
       */
      protected boolean handleJedisException(JedisException jedisException) {
          if (jedisException instanceof JedisConnectionException) {
              logger.error("Redis connection " + jedisPool.getAddress() + " lost.", jedisException);
          } else if (jedisException instanceof JedisDataException) {
              if ((jedisException.getMessage() != null) && (jedisException.getMessage().indexOf("READONLY") != -1)) {
                  logger.error("Redis connection " + jedisPool.getAddress() + " is a read-only slave.", jedisException);
              } else {
                  // dataException, isBroken=false
                  return false;
              }
          } else {
              logger.error("Jedis exception happen.", jedisException);
          }
          return true;
      }
      /**
       * Return the jedis connection to the pool, calling a different return method depending on the connectionBroken status.
       */
      protected void closeResource(Jedis jedis, boolean connectionBroken) {
          try {
              if (connectionBroken) {
                  jedisPool.returnBrokenResource(jedis);
              } else {
                  jedisPool.returnResource(jedis);
              }
          } catch (Exception e) {
              logger.error("return back jedis failed, will force close the jedis.", e);
              JedisUtils.destroyJedis(jedis);
          }
      }
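
The effect of this pattern can be checked with a toy pool. The sketch below does not use Jedis: ToyConn, ToyPool and BrokenReturnDemo are hypothetical stand-ins, and the pool is assumed to reuse whatever is returned through returnResource and to discard whatever is returned through returnBrokenResource.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyConn {
    final int id;
    ToyConn(int id) { this.id = id; }
}

class ToyPool {
    private final Deque<ToyConn> idle = new ArrayDeque<>();
    private int nextId = 1;

    /** Hands out an idle connection if there is one, otherwise creates a new one. */
    ToyConn getResource() {
        ToyConn conn = idle.pollFirst();
        return conn != null ? conn : new ToyConn(nextId++);
    }

    /** A healthy connection goes back to the idle list for reuse. */
    void returnResource(ToyConn conn) { idle.addFirst(conn); }

    /** A broken connection is dropped, so nobody can borrow it again. */
    void returnBrokenResource(ToyConn conn) { /* intentionally discarded */ }
}

class BrokenReturnDemo {
    static int[] run() {
        ToyPool pool = new ToyPool();
        ToyConn first = pool.getResource();
        boolean broken = false;
        try {
            throw new IllegalStateException("simulated read timeout");
        } catch (IllegalStateException e) {
            broken = true; // remember that this connection cannot be trusted
        } finally {
            if (broken) {
                pool.returnBrokenResource(first);
            } else {
                pool.returnResource(first);
            }
        }
        // The next borrower gets a fresh connection, not the broken one.
        ToyConn second = pool.getResource();
        return new int[] { first.id, second.id };
    }

    public static void main(String[] args) {
        int[] ids = run();
        System.out.println("first=" + ids[0] + ", second=" + ids[1]);
    }
}
```

Because the broken connection never re-enters the idle list, a later borrower cannot read a stale reply from it.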

      Supplement

      Simulating a Redis network timeout locally on Ubuntu:

      sudo iptables -A INPUT -p tcp --dport 6379 -j DROP

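      Restoring the network (note that -F flushes every rule in the filter table, not only the one added above):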

      sudo iptables -F

      Additional note:

      If the Jedis access logic looks like the following,

      Jedis jedis = null;
      try {
          jedis = jedisSentinelPool.getResource();
          return jedis.get(key);
      } catch (JedisConnectionException e) {
          jedisSentinelPool.returnBrokenResource(jedis);
          logger.error("", e);
          throw e;
      } catch (Exception e) {
          logger.error("", e);
          throw e;
      } finally {
          jedisSentinelPool.returnResource(jedis);
      }

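      then as soon as a JedisConnectionException occurs (a network failure, for example), returnBrokenResource runs first and the jedis instance is destroyed. Control then enters the finally block and returnResource is called on the same instance, which fails with: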

      redis.clients.jedis.exceptions.JedisException: Could not return the resource to the pool
          at redis.clients.util.Pool.returnResourceObject(Pool.java:65)
          at redis.clients.jedis.JedisSentinelPool.returnResource(JedisSentinelPool.java:221)
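
The double return follows directly from Java's try/catch/finally ordering: the finally block runs even when the catch block rethrows. The sketch below records the order of the calls instead of touching a real pool; DoubleReturnDemo is a hypothetical name and the strings only stand in for the two pool methods.

```java
import java.util.ArrayList;
import java.util.List;

class DoubleReturnDemo {
    static List<String> run() {
        List<String> calls = new ArrayList<>();
        try {
            try {
                throw new IllegalStateException("simulated connection failure");
            } catch (IllegalStateException e) {
                calls.add("returnBrokenResource"); // first return, in the catch block
                throw e;
            } finally {
                calls.add("returnResource"); // second return, always executed
            }
        } catch (IllegalStateException rethrown) {
            // Swallowed here only so the demo can report the recorded calls.
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Both calls are recorded for the same resource, which is why the pool rejects the second one.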

      Temporary workaround

      jedisSentinelPool.returnBrokenResource(jedis);
      jedis = null; // returnResource will then skip its actual work

      This is not recommended, though. The more rigorous way to release the resource is the one described earlier.
