Writing comparators for a compound object for binary search

4

I have a class and a list of instances that look roughly like this (field names changed to protect the innocent/proprietary):

public class Bloat
{
    public long timeInMilliseconds;
    public long spaceInBytes;
    public long costInPennies;
}

public class BloatProducer
{
    final private List<Bloat> bloatList = new ArrayList<Bloat>();
    final private Random random = new Random();
    public void produceMoreBloat()
    {
       int n = bloatList.size();
       Bloat previousBloat = (n == 0) ? new Bloat() : bloatList.get(n-1);
       Bloat newBloat = new Bloat();
       newBloat.timeInMilliseconds = 
          previousBloat.timeInMilliseconds + random.nextInt(10) + 1;
       newBloat.spaceInBytes = 
          previousBloat.spaceInBytes + random.nextInt(10) + 1;
       newBloat.costInPennies = 
          previousBloat.costInPennies + random.nextInt(10) + 1;
       bloatList.add(newBloat);
    }
    /* other fields/methods */

    public boolean testMonotonicity()
    {
        Bloat previousBloat = null;
        for (Bloat thisBloat : bloatList)
        {
            if (previousBloat != null)
            {
                if ((previousBloat.timeInMilliseconds
                        >= thisBloat.timeInMilliseconds)
                    || (previousBloat.spaceInBytes
                        >= thisBloat.spaceInBytes)
                    || (previousBloat.costInPennies
                        >= thisBloat.costInPennies))
                    return false;
            }
            previousBloat = thisBloat;
        }
        return true;
    }
}

BloatProducer bloatProducer;

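The list bloatList is kept internally by BloatProducer and is maintained in such a way that it only appends new Bloat records, never modifies any of the old ones, and each of the fields is monotonically increasing, e.g. bloatProducer.testMonotonicity() will always return true.
I would like to use Collections.binarySearch(list, key, comparator) to search for a Bloat record by any one of the timeInMilliseconds, spaceInBytes, or costInPennies fields. (If the number falls between two records, I want to find the previous record.)
What is the easiest way to write a series of three Comparator classes to make this work? Do I have to use a Bloat object with dummy fields as the search key?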

2
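Binary search only works correctly if the collection is sorted, which means wrapping a new sorted collection around your existing list. If your collection is very large, you will need another way of searching. Of course, the collections can reference the same objects, so "large" may not be as bad as you think. - Yishai
It is sorted by all the fields I care about, so I'm in luck. - Jason S
Have you considered an in-memory SQL server? It would give you indexing as well as inserts and so on. - akarnokd
No, this is for three fixed monotonic fields; if it were any more complicated I would do something like that. - Jason S
5 Answers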

5

You will need to write a separate comparator for each field you want to compare on:

public class BloatTimeComparator implements Comparator<Bloat> {
    public int compare(Bloat bloat1, Bloat bloat2) {
        if (bloat1.timeInMilliseconds > bloat2.timeInMilliseconds) {
            return 1;
        } else if (bloat1.timeInMilliseconds < bloat2.timeInMilliseconds) {
            return -1;
        } else {
            return 0;
        }
    }
}

Repeat this for each property of Bloat that you want to compare on (you will need one comparator class per property). Then use the Collections helper method:

Collections.binarySearch(bloatList,  bloatObjectToFind, 
    new BloatTimeComparator());

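From the Java documentation for the binarySearch method, the return value will be:
the index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1). The insertion point is defined as the point at which the key would be inserted into the list: the index of the first element greater than the key, or list.size() if all elements in the list are less than the specified key. Note that this guarantees that the return value will be >= 0 if and only if the key is found.
That gives you the index you asked for: on an exact match it is the return value itself, and when the key falls between two records the previous record is at (insertion point) - 1.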
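The asker wants the preceding record when the key falls between two entries. A minimal sketch of turning the binarySearch return value into that index follows; the class name FloorSearch and the method floorIndex are hypothetical helpers, not part of the answer or of the JDK.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: maps the binarySearch result to "this or the previous element".
final class FloorSearch {
    private FloorSearch() {}

    /**
     * Returns the index of the last element that is <= key under cmp,
     * or -1 if every element is greater than key.
     * The list must already be sorted according to cmp.
     */
    static <T> int floorIndex(List<? extends T> sorted, T key, Comparator<? super T> cmp) {
        int result = Collections.binarySearch(sorted, key, cmp);
        if (result >= 0) {
            return result;               // exact match
        }
        int insertionPoint = -result - 1; // undo the (-(insertion point) - 1) encoding
        return insertionPoint - 1;        // element just before the insertion point
    }

    public static void main(String[] args) {
        List<Long> times = Arrays.asList(3L, 7L, 12L, 20L);
        Comparator<Long> natural = Comparator.naturalOrder();
        System.out.println(floorIndex(times, 12L, natural)); // 2  (exact match)
        System.out.println(floorIndex(times, 10L, natural)); // 1  (between 7 and 12)
        System.out.println(floorIndex(times, 1L, natural));  // -1 (before the first element)
        System.out.println(floorIndex(times, 99L, natural)); // 3  (after the last element)
    }
}
```

With the answer's comparator, the same helper would be called as floorIndex(bloatList, key, new BloatTimeComparator()); a result of -1 means no record precedes the key.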

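If you used the boxed Long type rather than primitives in the compare() method, you could simply write return bloat1.timeInMilliseconds.compareTo(bloat2.timeInMilliseconds); and avoid the if statements, which might look nicer. *sigh* I wish Long had a static compare() method. - Grundlefleck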
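On newer JDKs the wish in the comment above is met: Long.compare(long, long) was added in Java 7, and Java 8 added Comparator.comparingLong. The following is a sketch assuming Java 8 or later and a stand-in Bloat class with a three-argument constructor; the BloatOrder holder class and its constant names are made up for illustration.

```java
import java.util.Comparator;

// Stand-in for the question's Bloat class (constructor added for convenience).
class Bloat {
    final long timeInMilliseconds;
    final long spaceInBytes;
    final long costInPennies;

    Bloat(long timeInMilliseconds, long spaceInBytes, long costInPennies) {
        this.timeInMilliseconds = timeInMilliseconds;
        this.spaceInBytes = spaceInBytes;
        this.costInPennies = costInPennies;
    }
}

// Hypothetical holder for the three orderings.
final class BloatOrder {
    private BloatOrder() {}

    // Java 7+: Long.compare replaces the three-way if/else without boxing.
    static final Comparator<Bloat> BY_TIME_JAVA7 = new Comparator<Bloat>() {
        @Override
        public int compare(Bloat a, Bloat b) {
            return Long.compare(a.timeInMilliseconds, b.timeInMilliseconds);
        }
    };

    // Java 8+: build each comparator from a key extractor.
    static final Comparator<Bloat> BY_TIME = Comparator.comparingLong(b -> b.timeInMilliseconds);
    static final Comparator<Bloat> BY_SPACE = Comparator.comparingLong(b -> b.spaceInBytes);
    static final Comparator<Bloat> BY_COST = Comparator.comparingLong(b -> b.costInPennies);

    public static void main(String[] args) {
        Bloat a = new Bloat(5, 10, 10);
        Bloat b = new Bloat(3, 12, 12);
        System.out.println(Integer.signum(BY_TIME.compare(a, b)));  // 1
        System.out.println(Integer.signum(BY_SPACE.compare(a, b))); // -1
    }
}
```

Any of these constants can be passed as the third argument to Collections.binarySearch in place of a hand-written comparator class.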

2

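If you want to search by all three properties, you will need three separate Comparators.

A cleaner option is a generic Comparator that takes a parameter telling it which field to compare by.

A basic generic comparator would look something like this: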

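// The answer does not show CompareByEnum; it is assumed here to be a plain enum.
enum CompareByEnum { TIME, SPACE, COST }

public class BloatComparator implements Comparator<Bloat>
{
    private final CompareByEnum field;

    public BloatComparator(CompareByEnum field) {
        this.field = field;
    }

    @Override
    public int compare(Bloat arg0, Bloat arg1) {
        // Long.compare (Java 7+) returns a negative, zero, or positive int.
        if (this.field == CompareByEnum.TIME) {
            // compare by field time
            return Long.compare(arg0.timeInMilliseconds, arg1.timeInMilliseconds);
        }
        else if (this.field == CompareByEnum.SPACE) {
            // compare by field space
            return Long.compare(arg0.spaceInBytes, arg1.spaceInBytes);
        }
        else {
            // compare by field cost
            return Long.compare(arg0.costInPennies, arg1.costInPennies);
        }
    }
}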
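One possible variation, assuming Java 8 or later: let each enum constant carry its own field accessor, so the comparator needs no if/else chain. The enum name BloatField and the stand-in Bloat class below are hypothetical, not taken from the answer.

```java
import java.util.Comparator;
import java.util.function.ToLongFunction;

// Stand-in for the question's Bloat class (constructor added for convenience).
class Bloat {
    final long timeInMilliseconds;
    final long spaceInBytes;
    final long costInPennies;

    Bloat(long timeInMilliseconds, long spaceInBytes, long costInPennies) {
        this.timeInMilliseconds = timeInMilliseconds;
        this.spaceInBytes = spaceInBytes;
        this.costInPennies = costInPennies;
    }
}

// Hypothetical enum: each constant is itself a Comparator<Bloat> for one field.
enum BloatField implements Comparator<Bloat> {
    TIME(b -> b.timeInMilliseconds),
    SPACE(b -> b.spaceInBytes),
    COST(b -> b.costInPennies);

    private final ToLongFunction<Bloat> getter;

    BloatField(ToLongFunction<Bloat> getter) {
        this.getter = getter;
    }

    @Override
    public int compare(Bloat a, Bloat b) {
        // Read the selected field from both records and compare the primitives.
        return Long.compare(getter.applyAsLong(a), getter.applyAsLong(b));
    }

    public static void main(String[] args) {
        Bloat a = new Bloat(1, 20, 300);
        Bloat b = new Bloat(2, 10, 300);
        for (BloatField f : values()) {
            System.out.println(f + ": " + Integer.signum(f.compare(a, b)));
        }
        // TIME: -1
        // SPACE: 1
        // COST: 0
    }
}
```

A call such as Collections.binarySearch(bloatList, key, BloatField.SPACE) then selects the field by naming the constant.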

1
Here is a test-driven way of writing the first comparator.
import java.util.Comparator;
import junit.framework.TestCase;

public class BloatTest extends TestCase {
    public class Bloat {
        public long timeInMilliseconds;
        public long spaceInBytes;
        public long costInPennies;

        public Bloat(long timeInMilliseconds, long spaceInBytes, long costInPennies) {
            this.timeInMilliseconds = timeInMilliseconds;
            this.spaceInBytes = spaceInBytes;
            this.costInPennies = costInPennies;
        }
    }

    public void testMillisecondComparator() throws Exception {
        Bloat a = new Bloat(5, 10, 10);
        Bloat b = new Bloat(3, 12, 12);
        Bloat c = new Bloat(5, 12, 12);

        Comparator<Bloat> comparator = new MillisecondComparator();
        assertTrue(comparator.compare(a, b) > 0);
        assertTrue(comparator.compare(b, a) < 0);
        assertEquals(0, comparator.compare(a, c));
    }

    private static class MillisecondComparator implements Comparator<Bloat> {
        public int compare(Bloat a, Bloat b) {
            Long aTime = a.timeInMilliseconds;
            return aTime.compareTo(b.timeInMilliseconds);
        }
    }


}

0

A test program (MultiBinarySearch.java) to check whether these ideas work (they seem to):

package com.example.test;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

class Bloat
{
    final public long timeInMilliseconds;
    final public long spaceInBytes;
    final public long costInPennies;
    static final private int N = 100; 
    public Bloat(long l1, long l2, long l3) {
        timeInMilliseconds = l1;
        spaceInBytes = l2;
        costInPennies = l3; 
    }
    public Bloat() { this(0,0,0); }
    public Bloat moreBloat(Random r)
    {
        return new Bloat(
                timeInMilliseconds + r.nextInt(N) + 1,
                spaceInBytes + r.nextInt(N) + 1,
                costInPennies + r.nextInt(N) + 1
        );
    }
    public String toString() {
        return "[bloat: time="+timeInMilliseconds
            +", space="+spaceInBytes
            +", cost="+costInPennies
            +"]";
    }

    static int compareLong(long l1, long l2)
    {
        if (l2 > l1)
            return -1;
        else if (l1 > l2)
            return 1;
        else
            return 0;
    }

    public static class TimeComparator implements Comparator<Bloat> {
        public int compare(Bloat bloat1, Bloat bloat2) {
            return compareLong(bloat1.timeInMilliseconds, bloat2.timeInMilliseconds);
        }
    }
    public static class SpaceComparator implements Comparator<Bloat> {
        public int compare(Bloat bloat1, Bloat bloat2) {
            return compareLong(bloat1.spaceInBytes, bloat2.spaceInBytes);
        }
    }
    public static class CostComparator implements Comparator<Bloat> {
        public int compare(Bloat bloat1, Bloat bloat2) {
            return compareLong(bloat1.costInPennies, bloat2.costInPennies);
        }
    }
    enum Type { 
        TIME(new TimeComparator()), 
        SPACE(new SpaceComparator()),
        COST(new CostComparator());

        public Comparator<Bloat> comparator;
        Type(Comparator<Bloat> c) { this.comparator = c; } 
    } 
}

class BloatProducer
{
    final private List<Bloat> bloatList = new ArrayList<Bloat>();
    final private Random random = new Random();
    public void produceMoreBloat()
    {
        int n = bloatList.size();
        Bloat newBloat =
            (n == 0) ? new Bloat() : bloatList.get(n-1).moreBloat(random);
        bloatList.add(newBloat);
    }
    /* other fields/methods */

    public boolean testMonotonicity()
    {
        Bloat previousBloat = null;
        for (Bloat thisBloat : bloatList)
        {
            if (previousBloat != null)
            {
                if ((previousBloat.timeInMilliseconds 
                        >= thisBloat.timeInMilliseconds)
                    || (previousBloat.spaceInBytes 
                        >= thisBloat.spaceInBytes)
                    || (previousBloat.costInPennies
                        >= thisBloat.costInPennies))
                    return false;
            }
            previousBloat = thisBloat;
        }
        return true;
    }
    public int searchBy(Bloat.Type t, Bloat key)
    {
        return Collections.binarySearch(bloatList, key, t.comparator);
    }
    public void showSearch(Bloat.Type t, Bloat key)
    {
        System.out.println("Search by "+t+": "); 
        System.out.println(key);
        int i = searchBy(t,key);
        if (i >= 0)
        {
            System.out.println("matches");
            System.out.println(bloatList.get(i));
        }
        else
        {
            System.out.println("is between");
            i = -i-1;
            Bloat b1 = (i == 0) ? null : bloatList.get(i-1);
            System.out.println(b1);
            Bloat b2 = (i >= bloatList.size()) ? null : bloatList.get(i);
            System.out.println("and");
            System.out.println(b2);
        }
    }
}

public class MultiBinarySearch {
    private static int N = 1000;
    public static void main(String[] args)
    {
        BloatProducer bloatProducer = new BloatProducer();
        for (int i = 0; i < N; ++i)
        {
            bloatProducer.produceMoreBloat();
        }

        System.out.println("testMonotonicity() returns "+
                bloatProducer.testMonotonicity());
        Bloat key;
        key = new Bloat(10*N, 20*N, 30*N);
        bloatProducer.showSearch(Bloat.Type.COST, key);
        bloatProducer.showSearch(Bloat.Type.SPACE, key);
        bloatProducer.showSearch(Bloat.Type.TIME, key);
        key = new Bloat(-10000, 0, 1000*N);
        bloatProducer.showSearch(Bloat.Type.COST, key);
        bloatProducer.showSearch(Bloat.Type.SPACE, key);
        bloatProducer.showSearch(Bloat.Type.TIME, key);
    }
}

+1 for the enum approach. Now it's clear: each field is monotonic, e.g. (1, 1, 1) < (2, 2, 2). - akarnokd

0

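If you want to use binary search on all three properties, you need to create comparators for them and keep additional Lists or TreeSets sorted by those comparators.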


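Right, I guess my question is how to write the comparators... I don't want to maintain additional data structures because they are fairly large in my real application. - Jason S
How large? 10^6 or more? You would only have references in the parallel arrays, not exact copies of the elements. - akarnokd
Oh, I see, it's a List<> of objects, not a list of values. - Jason S
Still, since the fields are monotonic in my application, I don't need the additional lists. - Jason S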
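
Because the fields are monotonic, one way to search without extra sorted copies and without a dummy Bloat key is to expose a single field as a read-only List<Long> view over the existing list and run the natural-order Collections.binarySearch on it. This is only a sketch under assumptions: Java 8 or later, a stand-in Bloat class, and the invented names FieldView and BloatSearch.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.RandomAccess;
import java.util.function.ToLongFunction;

// Stand-in for the question's Bloat class (constructor added for convenience).
class Bloat {
    final long timeInMilliseconds;
    final long spaceInBytes;
    final long costInPennies;

    Bloat(long timeInMilliseconds, long spaceInBytes, long costInPennies) {
        this.timeInMilliseconds = timeInMilliseconds;
        this.spaceInBytes = spaceInBytes;
        this.costInPennies = costInPennies;
    }
}

// Hypothetical read-only view: element i is one field of source.get(i). Nothing is copied.
final class FieldView extends AbstractList<Long> implements RandomAccess {
    private final List<Bloat> source;
    private final ToLongFunction<Bloat> field;

    FieldView(List<Bloat> source, ToLongFunction<Bloat> field) {
        this.source = source;
        this.field = field;
    }

    @Override
    public Long get(int index) {
        return field.applyAsLong(source.get(index));
    }

    @Override
    public int size() {
        return source.size();
    }
}

final class BloatSearch {
    private BloatSearch() {}

    /** Index of the last record whose field value is <= key, or -1 if there is none. */
    static int floorIndex(List<Bloat> list, ToLongFunction<Bloat> field, long key) {
        int result = Collections.binarySearch(new FieldView(list, field), Long.valueOf(key));
        // Not found: result == -(insertion point) - 1, so -result - 2 == insertion point - 1.
        return result >= 0 ? result : -result - 2;
    }

    public static void main(String[] args) {
        List<Bloat> list = new ArrayList<>();
        list.add(new Bloat(1, 10, 100));
        list.add(new Bloat(5, 20, 200));
        list.add(new Bloat(9, 30, 300));
        System.out.println(floorIndex(list, b -> b.timeInMilliseconds, 6)); // 1
        System.out.println(floorIndex(list, b -> b.costInPennies, 50));     // -1
    }
}
```

The view implements RandomAccess so that binarySearch uses index-based access; each probe boxes one Long, which is O(log n) small allocations per search rather than a second list.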
