Replacing elements in long list Python

Chops :

I am trying to replace a number of elements (around 40k) from a much larger list (3 million elements) with a tag. The code below works however it is extremely slow.

def UNKWords(words):
    words = Counter(words)
    wordsToBeReplaced = []
    for key, value in words.items():
        if(value == 1):
            wordsToBeReplaced.append(key)
    return wordsToBeReplaced

wordsToBeReplaced = UNKWords(trainingWords)

replacedWordsList = ["<UNK>" if word in wordsToBeReplaced else word for word in trainingWords]

Is there a more efficient way of replacing elements in such a large list?

blhsing :

You can make wordsToBeReplaced a set instead so that lookups can be done in constant time on average rather than linear time:

def UNKWords(words):
    return {word for word, count in Counter(words).items() if count == 1}

Guess you like

Origin http://43.154.161.224:23101/article/api/json?id=25626&siteId=1