Implementation of an Ad Index Retrieval Engine Based on BitSet

 

This article took effort to write; please credit the source when reprinting (http://shihlei.iteye.com/blog/2358063)

 

I. Overview

In an advertising system, when a campaign is created, the operator usually sets basic targeting for the ad according to its intended audience. For example, a Chanel promotion might be targeted at female users in Shanghai.

 

Therefore, indexing and retrieving ad campaigns by targeting criteria is an essential feature of the delivery engine.

 

A typical implementation can use an indexing engine such as ElasticSearch. This article attempts a simple in-memory index and retrieval engine based on BitSet (a bitmap).

 

II. Approach

Index: Build one BitSet per targeting condition. Indexing a campaign under a targeting condition means setting the bit whose index equals the campaign ID to 1 in that condition's BitSet.

Retrieval: Given a set of targeting conditions, fetch the BitSet for each one and AND them together. The intersection is the final BitSet, and the indexes of its 1 bits are the IDs of the campaigns that satisfy every condition.
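
The retrieval idea can be illustrated in isolation. The following is only a minimal sketch, not the engine itself: the class name BitSetIntersectionSketch and the method intersect are hypothetical names chosen for this example, and the campaign IDs are the same illustrative ones used later in the article.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

class BitSetIntersectionSketch {

    // AND together all given BitSets without modifying any of them.
    // (Hypothetical helper, used only for this illustration.)
    static BitSet intersect(List<BitSet> sets) {
        BitSet result = new BitSet();
        if (sets.isEmpty()) {
            return result;
        }
        // Copy the first set into the result, then narrow it down.
        result.or(sets.get(0));
        for (BitSet s : sets.subList(1, sets.size())) {
            result.and(s);
        }
        return result;
    }

    public static void main(String[] args) {
        // Bit N set means "campaign N has this targeting condition".
        BitSet female = new BitSet();
        female.set(1000);
        female.set(2000);
        female.set(3000);

        BitSet shanghai = new BitSet();
        shanghai.set(2000);
        shanghai.set(4000);

        BitSet matched = intersect(Arrays.asList(female, shanghai));
        System.out.println(matched); // prints {2000}
    }
}
```

Only campaign 2000 has both bits set, so it is the only ID left after the AND.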

 

Advantages: The index is held in memory and is compact (one bit per campaign ID per targeting condition).

Disadvantages:

1) When a BitSet has a very large number of bits, the AND operation becomes slower; this needs further study.

2) Campaign IDs must be non-negative integers, because they are used as bit indexes.

3) When there are many targeting conditions, the approach is still hard to use.

4) Targeting conditions may also need to be combined with OR; that case is not considered here.

Other issues have not been considered in depth.
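
Item 4 above is left open by the article. One possible way to handle it, shown here purely as an assumption and not as part of the original engine, is to OR the BitSets within one targeting dimension (for example, several areas) and then AND the results across dimensions. The class OrWithinDimensionSketch, the methods match and of, and the string keys are all hypothetical names for this sketch.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OrWithinDimensionSketch {

    // Each inner list is one dimension (e.g. area). Conditions inside a
    // dimension are OR-ed; the per-dimension results are then AND-ed.
    static BitSet match(Map<String, BitSet> index, List<List<String>> dimensions) {
        BitSet result = null;
        for (List<String> dimension : dimensions) {
            BitSet union = new BitSet();
            for (String condition : dimension) {
                BitSet s = index.get(condition);
                if (s != null) {
                    union.or(s); // unknown conditions contribute nothing
                }
            }
            if (result == null) {
                result = union;
            } else {
                result.and(union);
            }
        }
        return result == null ? new BitSet() : result;
    }

    // Hypothetical helper: build a BitSet with the given bits set.
    static BitSet of(int... bits) {
        BitSet s = new BitSet();
        for (int b : bits) {
            s.set(b);
        }
        return s;
    }

    public static void main(String[] args) {
        Map<String, BitSet> index = new HashMap<>();
        index.put("GENDER_FEMALE", of(1000, 2000, 3000));
        index.put("AREA_BEIJING", of(1000));
        index.put("AREA_SHANGHAI", of(2000, 4000));

        // female AND (Beijing OR Shanghai)
        BitSet matched = match(index, Arrays.asList(
                Arrays.asList("GENDER_FEMALE"),
                Arrays.asList("AREA_BEIJING", "AREA_SHANGHAI")));
        System.out.println(matched); // prints {1000, 2000}
    }
}
```

Grouping conditions by dimension keeps the query model simple, but it does not cover arbitrary boolean expressions; those would need a different design.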

 

III. Implementation

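import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class AdvEngine {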

    private static final int BITSET_SIZE = 100000000;

    // Index: one BitSet per targeting condition; bit N is set when campaign N has that targeting
    private final Map<TargetEnum, BitSet> indexes = new HashMap<>();


    public static void main(String[] args) {
        AdvEngine advEngine = new AdvEngine();

        // Campaign 1000 targets: Beijing, female
        advEngine.index(1000, Arrays.asList(TargetEnum.GENDER_FEMALE, TargetEnum.AREA_BEIJING));
        // Campaign 2000 targets: Shanghai, female
        advEngine.index(2000, Arrays.asList(TargetEnum.GENDER_FEMALE, TargetEnum.AREA_SHAGNHAI));
        // Campaign 3000 targets: female
        advEngine.index(3000, Arrays.asList(TargetEnum.GENDER_FEMALE));
        // Campaign 4000 targets: Shanghai, male
        advEngine.index(4000, Arrays.asList(TargetEnum.GENDER_MALE, TargetEnum.AREA_SHAGNHAI));

        System.out.println("Campaigns that can be delivered to women in Shanghai: ");
        List<Integer> campaignIds = advEngine.search(Arrays.asList(TargetEnum.GENDER_FEMALE, TargetEnum.AREA_SHAGNHAI));
        campaignIds.forEach(System.out::println);

        System.out.println("Campaigns that can be delivered to women: ");
        campaignIds = advEngine.search(Arrays.asList(TargetEnum.GENDER_FEMALE));
        campaignIds.forEach(System.out::println);
    }


    /**
     * Index a campaign under each of its targeting conditions.
     *
     * @param campaignId campaign ID (used as the bit index, so it must be non-negative)
     * @param targets    targeting conditions of the campaign
     */
    public void index(int campaignId, List<TargetEnum> targets) {
        for (TargetEnum target : targets) {
            // Create the BitSet for this condition on first use, then set the campaign's bit
            indexes.computeIfAbsent(target, t -> new BitSet(BITSET_SIZE)).set(campaignId);
        }
    }


    /**
     * Find the campaigns that match all of the given targeting conditions.
     *
     * @param targets targeting conditions
     * @return IDs of the matching campaigns; empty if nothing matches
     */
    public List<Integer> search(List<TargetEnum> targets) {
        List<Integer> campaignIds = new LinkedList<>();
        if (targets.isEmpty()) {
            return campaignIds;
        }

        BitSet finalSet = null;

        // Intersect the BitSets of all requested targeting conditions
        long start = System.nanoTime();
        for (TargetEnum target : targets) {
            BitSet campaignBitSet = indexes.get(target);
            if (campaignBitSet == null) {
                // No campaign is indexed under this condition, so the intersection is empty
                return campaignIds;
            }
            if (finalSet == null) {
                // Clone so that the AND below does not modify the stored index
                finalSet = (BitSet) campaignBitSet.clone();
            } else {
                finalSet.and(campaignBitSet);
            }
        }
        long end = System.nanoTime();
        System.out.println("time1 : " + (end - start));

        // Collect the indexes of the set bits; nextSetBit skips over runs of zero bits
        long start2 = System.nanoTime();
        for (int i = finalSet.nextSetBit(0); i >= 0; i = finalSet.nextSetBit(i + 1)) {
            campaignIds.add(i);
            if (i == Integer.MAX_VALUE) {
                break; // avoid overflow of i + 1
            }
        }
        long end2 = System.nanoTime();
        System.out.println("time2 : " + (end2 - start2));

        return campaignIds;
    }

    /**
     * Remove a campaign from the index.
     *
     * @param campaignId campaign ID
     */
    public void dropIndex(int campaignId) {
        // Clear the campaign's bit in the BitSet of every targeting condition
        for (BitSet campaignBitSet : indexes.values()) {
            campaignBitSet.clear(campaignId);
        }
    }


    /**
     * Targeting condition enumeration
     */
    public enum TargetEnum {
        // Gender targeting
        GENDER_MALE,
        GENDER_FEMALE,
        // Geographic targeting
        AREA_BEIJING,
        AREA_SHAGNHAI;
    }

}
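
A note on BITSET_SIZE: passing 100000000 to the BitSet constructor reserves room for that many bits up front, which is roughly 12.5 MB (100,000,000 / 8 bytes) per targeting condition, regardless of how many campaigns are indexed. The sketch below compares that with a BitSet created by the no-argument constructor, which grows on demand. The class BitSetMemorySketch and the method storageBytes are hypothetical names for this illustration, and the estimate ignores object overhead.

```java
import java.util.BitSet;

class BitSetMemorySketch {

    // Approximate bytes of bit storage held by a BitSet (ignores object overhead).
    static long storageBytes(BitSet bitSet) {
        return bitSet.size() / 8L;
    }

    public static void main(String[] args) {
        // Same size as BITSET_SIZE in the engine above
        BitSet preallocated = new BitSet(100_000_000);

        // Default initial capacity; storage grows when a higher bit is set
        BitSet lazy = new BitSet();
        lazy.set(4000);

        System.out.println("preallocated: " + storageBytes(preallocated) + " bytes");
        System.out.println("lazy:         " + storageBytes(lazy) + " bytes");
    }
}
```

Preallocating avoids resizing while indexing, at the cost of memory that scales with the number of targeting conditions rather than with the number of campaigns. Which trade-off is better depends on the range of campaign IDs in use.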

 

IV. BitSet API Overview

(1) Set bit

void set(int bitIndex)

 

(2) Get bit

boolean get(int bitIndex)

 

(3) Bitwise operations (each modifies the BitSet it is called on)

void and(BitSet set) // AND

void or(BitSet set)  // OR

void xor(BitSet set) // XOR

 

(4) Clear bit

void clear(int bitIndex)
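
The methods listed above can be tried out together in a short, self-contained demo. This is only an illustrative sketch; the class BitSetApiDemo and the helper of are hypothetical names, while set, get, and, or, xor, clear, and clone are the standard java.util.BitSet methods.

```java
import java.util.BitSet;

class BitSetApiDemo {

    // Hypothetical helper for this demo: build a BitSet with the given bits set.
    static BitSet of(int... bits) {
        BitSet s = new BitSet();
        for (int b : bits) {
            s.set(b);
        }
        return s;
    }

    public static void main(String[] args) {
        BitSet a = of(1, 3, 5);
        BitSet b = of(3, 4, 5);

        System.out.println(a.get(3)); // true
        System.out.println(a.get(2)); // false

        // and/or/xor modify the receiver in place, so work on copies of a
        BitSet and = (BitSet) a.clone();
        and.and(b);
        System.out.println(and); // {3, 5}

        BitSet or = (BitSet) a.clone();
        or.or(b);
        System.out.println(or); // {1, 3, 4, 5}

        BitSet xor = (BitSet) a.clone();
        xor.xor(b);
        System.out.println(xor); // {1, 4}

        a.clear(3);
        System.out.println(a); // {1, 5}
    }
}
```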
