我怎样才能找到带有Mathematica的Waldo?

本文翻译自:How do I find Waldo with Mathematica?

This was bugging me over the weekend: What is a good way to solve those Where's Waldo? 周末这让我很烦恼:什么是解决那些Waldo的好方法 [ 'Wally' outside of North America] puzzles, using Mathematica (image-processing and other functionality)? [ 'Wally'在北美之外]使用Mathematica(图像处理和其他功能)进行拼图?

Here is what I have so far, a function which reduces the visual complexity a little bit by dimming some of the non-red colors: 这是我到目前为止的功能,它通过调暗一些非红色来减少视觉复杂度:

whereIsWaldo[url_] := Module[{waldo, waldo2, waldoMask},
    waldo = Import[url];
    waldo2 = Image[ImageData[
        waldo] /. {{r_, g_, b_} /;
          Not[r > .7 && g < .3 && b < .3] :> {0, 0,
          0}, {r_, g_, b_} /; (r > .7 && g < .3 && b < .3) :> {1, 1,
          1}}];
    waldoMask = Closing[waldo2, 4];
    ImageCompose[waldo, {waldoMask, .5}]
]

And an example of a URL where this 'works': 以及这个“有效”的网址示例:

whereIsWaldo["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"]

(Waldo is by the cash register): (Waldo是收银台):

Mathematica图形


#1楼

参考:https://stackoom.com/question/ZZn0/我怎样才能找到带有Mathematica的Waldo


#2楼

I have a quick solution for finding Waldo using OpenCV. 我有一个使用OpenCV查找Waldo的快速解决方案。

I used the template matching function available in OpenCV to find Waldo. 我使用OpenCV中提供的模板匹配功能来查找Waldo。

To do this a template is needed. 为此,需要一个模板。 So I cropped Waldo from the original image and used it as a template. 所以我从原始图像裁剪了Waldo并将其用作模板。

在此输入图像描述

Next I called the cv2.matchTemplate() function along with the normalized correlation coefficient as the method used. 接下来,我将cv2.matchTemplate()函数与规范化的相关系数一起称为所使用的方法。 It returned a high probability at a single region as shown in white below (somewhere in the top left region): 它在单个区域返回的概率很高,如下面的白色所示(左上角区域的某处):

在此输入图像描述

The position of the highest probable region was found using cv2.minMaxLoc() function, which I then used to draw the rectangle to highlight Waldo: 使用cv2.minMaxLoc()函数找到最高可能区域的位置,然后我用它绘制矩形以突出显示Waldo:

在此输入图像描述


#3楼

I've found Waldo! 我找到了沃尔多!

沃尔多被发现了

How I've done it 我是怎么做到的

First, I'm filtering out all colours that aren't red 首先,我过滤掉所有不是红色的颜色

waldo = Import["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"];
red = Fold[ImageSubtract, #[[1]], Rest[#]] &@ColorSeparate[waldo];

Next, I'm calculating the correlation of this image with a simple black and white pattern to find the red and white transitions in the shirt. 接下来,我正在计算这个图像与简单的黑白图案的相关性,以找到衬衫中的红色和白色过渡。

corr = ImageCorrelate[red, 
   Image@Join[ConstantArray[1, {2, 4}], ConstantArray[0, {2, 4}]], 
   NormalizedSquaredEuclideanDistance];

I use Binarize to pick out the pixels in the image with a sufficiently high correlation and draw white circle around them to emphasize them using Dilation 我使用Binarize来选择具有足够高相关性的图像中的像素,并在它们周围绘制白色圆圈以使用Dilation强调它们

pos = Dilation[ColorNegate[Binarize[corr, .12]], DiskMatrix[30]];

I had to play around a little with the level. 我不得不在水平上玩一点。 If the level is too high, too many false positives are picked out. 如果水平太高,则挑选出太多误报。

Finally I'm combining this result with the original image to get the result above 最后,我将这个结果与原始图像结合起来得到上面的结果

found = ImageMultiply[waldo, ImageAdd[ColorConvert[pos, "GrayLevel"], .5]]

#4楼

My guess at a "bulletproof way to do this" (think CIA finding Waldo in any satellite image any time, not just a single image without competing elements, like striped shirts)... I would train a Boltzmann machine on many images of Waldo - all variations of him sitting, standing, occluded, etc.; 我猜测“做到这一点的防弹方式”(想想CIA随时在任何卫星图像中找到Waldo,而不仅仅是没有竞争元素的单个图像,如条纹衬衫)......我会在Waldo的许多图像上训练Boltzmann机器 - 他坐着,站立,闭塞等各种变化; shirt, hat, camera, and all the works. 衬衫,帽子,相机和所有的作品。 You don't need a large corpus of Waldos (maybe 3-5 will be enough), but the more the better. 你不需要大量的Waldos(也许3-5就够了),但越多越好。

This will assign clouds of probabilities to various elements occurring in whatever the correct arrangement, and then establish (via segmentation) what an average object size is, fragment the source image into cells of objects which most resemble individual people (considering possible occlusions and pose changes), but since Waldo pictures usually include a LOT of people at about the same scale, this should be a very easy task, then feed these segments of the pre-trained Boltzmann machine. 这会将概率云分配给以正确排列方式出现的各种元素,然后建立(通过分割)平均对象大小是什么,将源图像分割成最类似于个体的对象的单元格(考虑可能的遮挡和姿势变化) ),但由于Waldo图片通常包含大量相同比例的人,这应该是一项非常简单的任务,然后为预先训练好的Boltzmann机器提供这些部分。 It will give you probability of each one being Waldo. 它会给你每个人成为Waldo的概率。 Take one with the highest probability. 以最高概率获得一个。

This is how OCR, ZIP code readers, and strokeless handwriting recognition work today. 这就是今天OCR,邮政编码阅读器和无笔划手写识别的工作原理。 Basically you know the answer is there, you know more or less what it should look like, and everything else may have common elements, but is definitely "not it", so you don't bother with the "not it"s, you just look of the likelihood of "it" among all possible "it"s you've seen before" (in ZIP codes for example, you'd train BM for just 1s, just 2s, just 3s, etc., then feed each digit to each machine, and pick one that has most confidence). This works a lot better than a single neural network learning features of all numbers. 基本上你知道答案就在那里,你或多或少知道它应该是什么样的,其他一切都可能有共同的元素,但绝对是“不是它”,所以你不要打扰“不是它”,你只看一下你之前看过的所有可能“它”中“它”的可能性“(例如,在邮政编码中,你只训练BM仅1s,只需2s,只需3s等,然后喂每个数字到每台机器,并选择一个最有信心的。)这比所有数字的单个神经网络学习功能好很多。


#5楼

I don't know Mathematica . 我不知道Mathematica。 . . too bad. 太糟糕了。 But I like the answer above, for the most part. 但我最喜欢上面的答案。

Still there is a major flaw in relying on the stripes alone to glean the answer (I personally don't have a problem with one manual adjustment). 还有在依靠条纹独自搜集的答案(我个人没有一个手动调节的问题)的一大缺陷。 There is an example (listed by Brett Champion, here ) presented which shows that they, at times, break up the shirt pattern. 有一个例子(由Brett Champion列出, 这里 )显示他们有时打破了衬衫模式。 So then it becomes a more complex pattern. 那么它就变成了一个更复杂的模式。

I would try an approach of shape id and colors, along with spacial relations. 我会尝试一种形状id和颜色的方法,以及空间关系。 Much like face recognition, you could look for geometric patterns at certain ratios from each other. 就像人脸识别一样,你可以在一定比例下寻找几何图案。 The caveat is that usually one or more of those shapes is occluded. 需要注意的是,通常这些形状中的一个或多个被遮挡。

Get a white balance on the image, and red a red balance from the image. 在图像上获得白平衡,并从图像中红色显示红色。 I believe Waldo is always the same value/hue, but the image may be from a scan, or a bad copy. 我相信Waldo总是具有相同的值/色调,但图像可能来自扫描或坏的副本。 Then always refer to an array of the colors that Waldo actually is: red, white, dark brown, blue, peach, {shoe color}. 然后总是参考Waldo实际上的颜色数组:红色,白色,深棕色,蓝色,桃色,{鞋色}。

There is a shirt pattern, and also the pants, glasses, hair, face, shoes and hat that define Waldo. 有一种衬衫图案,以及定义Waldo的裤子,眼镜,头发,面部,鞋子和帽子。 Also, relative to other people in the image, Waldo is on the skinny side. 而且,相对于图像中的其他人,Waldo处于瘦弱的一面。

So, find random people to obtain an the height of people in this pic. 因此,找到随机的人来获得这张照片中的人的高度。 Measure the average height of a bunch of things at random points in the image (a simple outline will produce quite a few individual people). 测量图像中随机点的一堆东西的平均高度(简单的轮廓将产生相当多的个人)。 If each thing is not within some standard deviation from each other, they are ignored for now. 如果每件事物彼此之间没有标准偏差,则暂时忽略它们。 Compare the average of heights to the image's height. 将高度的平均值与图像的高度进行比较。 If the ratio is too great (eg, 1:2, 1:4, or similarly close), then try again. 如果比例太大(例如,1:2,1:4,或类似地接近),则再试一次。 Run it 10(?) of times to make sure that the samples are all pretty close together, excluding any average that is outside some standard deviation. 运行10(?)次以确保样本非常接近,排除任何超出某些标准偏差的平均值。 Possible in Mathematica? 可能在Mathematica?

This is your Waldo size. 这是你的Waldo尺寸。 Walso is skinny, so you are looking for something 5:1 or 6:1 (or whatever) ht:wd. Walso很瘦,所以你正在寻找5:1或6:1(或其他)的东西:wd。 However, this is not sufficient. 但是,这还不够。 If Waldo is partially hidden, the height could change. 如果Waldo部分隐藏,高度可能会改变。 So, you are looking for a block of red-white that ~2:1. 所以,你正在寻找一块~2:1的红白色块。 But there has to be more indicators. 但必须有更多的指标。

  1. Waldo has glasses. 沃尔多有眼镜。 Search for two circles 0.5:1 above the red-white. 在红白色上方搜索0.5:1的两个圆圈。
  2. Blue pants. 蓝裤子。 Any amount of blue at the same width within any distance between the end of the red-white and the distance to his feet. 在红白色的末端和到他的脚的距离之间的任何距离内的任何数量的蓝色。 Note that he wears his shirt short, so the feet are not too close. 请注意,他穿的衬衫很短,所以脚不太靠近。
  3. The hat. 帽子。 Red-white any distance up to twice the top of his head. 红白色任何距离,最高可达头顶两倍。 Note that it must have dark hair below, and probably glasses. 请注意,下面必须有深色头发,可能还有眼镜。
  4. Long sleeves. 长袖。 red-white at some angle from the main red-white. 从主红白色的某个角度看的红白色。
  5. Dark hair. 黑头发。
  6. Shoe color. 鞋子的颜色。 I don't know the color. 我不知道颜色。

Any of those could apply. 任何这些都可以适用。 These are also negative checks against similar people in the pic -- eg, #2 negates wearing a red-white apron (too close to shoes), #5 eliminates light colored hair. 这些也是对照中类似人物的负面检查 - 例如,#2否定穿着红白色的围裙(太靠近鞋子),#5消除浅色头发。 Also, shape is only one indicator for each of these tests . 此外,形状只是每个测试的一个指标。 . . color alone within the specified distance can give good results. 在指定距离内单独使用颜色可以产生良好的效果。

This will narrow down the areas to process. 这将缩小要处理的范围。

Storing these results will produce a set of areas that should have Waldo in it. 存储这些结果将产生一组应该包含Waldo的区域。 Exclude all other areas (eg, for each area, select a circle twice as big as the average person size), and then run the process that @Heike laid out with removing all but red, and so on. 排除所有其他区域(例如,对于每个区域,选择一个两倍于一般人的大小的圆圈),然后运行@Heike布局的过程除去除红色之外的所有区域,依此类推。

Any thoughts on how to code this? 有关如何编码的任何想法?


Edit: 编辑:

Thoughts on how to code this . 关于如何编码的想法。 . . exclude all areas but Waldo red, skeletonize the red areas, and prune them down to a single point. 排除除Waldo红色之外的所有区域,对红色区域进行镂空,并将它们修剪为单个点。 Do the same for Waldo hair brown, Waldo pants blue, Waldo shoe color. 同样适用于Waldo棕色头发,Waldo裤子蓝色,Waldo鞋子颜色。 For Waldo skin color, exclude, then find the outline. 对于Waldo肤色,排除,然后找到轮廓。

Next, exclude non-red, dilate (a lot) all the red areas, then skeletonize and prune. 接下来,排除所有红色区域的非红色,扩张(很多),然后进行骨架化和修剪。 This part will give a list of possible Waldo center points. 这部分将列出可能的沃尔多中心点。 This will be the marker to compare all other Waldo color sections to. 这将是比较所有其他Waldo颜色部分的标记。

From here, using the skeletonized red areas (not the dilated ones), count the lines in each area. 从这里开始,使用镂空的红色区域(不是扩张的区域),计算每个区域的线条。 If there is the correct number (four, right?), this is certainly a possible area. 如果有正确的数字(四个,对吗?),这肯定是一个可能的区域。 If not, I guess just exclude it (as being a Waldo center . . . it may still be his hat). 如果没有,我想只是排除它(作为一个沃尔多中心......它可能仍然是他的帽子)。

Then check if there is a face shape above, a hair point above, pants point below, shoe points below, and so on. 然后检查上面是否有脸形,上方有发点,裤子指向下方,鞋点位于下方,等等。

No code yet -- still reading the docs. 还没有代码 - 仍在阅读文档。


#6楼

I agree with @GregoryKlopper that the right way to solve the general problem of finding Waldo (or any object of interest) in an arbitrary image would be to train a supervised machine learning classifier. 我同意@GregoryKlopper的意见,解决在任意图像中找到Waldo(或任何感兴趣的对象)的一般问题的正确方法是训练有监督的机器学习分类器。 Using many positive and negative labeled examples, an algorithm such as Support Vector Machine , Boosted Decision Stump or Boltzmann Machine could likely be trained to achieve high accuracy on this problem. 使用许多正面和负面标记的示例,可以训练诸如支持向量机Boosted Decision Stump或Boltzmann Machine的算法以在该问题上实现高精度。 Mathematica even includes these algorithms in its Machine Learning Framework . Mathematica甚至在其机器学习框架中包含了这些算法。

The two challenges with training a Waldo classifier would be: 培训Waldo分类器的两个挑战是:

  1. Determining the right image feature transform. 确定正确的图像特征变换。 This is where @Heike's answer would be useful: a red filter and a stripped pattern detector (eg, wavelet or DCT decomposition) would be a good way to turn raw pixels into a format that the classification algorithm could learn from. 这是@Heike的答案有用的地方:红色滤镜和剥离模式检测器(例如,小波或DCT分解)将是将原始像素转换为分类算法可以学习的格式的好方法。 A block-based decomposition that assesses all subsections of the image would also be required ... but this is made easier by the fact that Waldo is a) always roughly the same size and b) always present exactly once in each image. 还需要一个基于块的分解来评估图像的所有子部分......但是由于Waldo a)总是大致相同的大小而且b)在每个图像中始终只出现一次,因此这变得更容易。
  2. Obtaining enough training examples. 获得足够的训练样例。 SVMs work best with at least 100 examples of each class. SVM最适合每个类至少100个示例。 Commercial applications of boosting (eg, the face-focusing in digital cameras) are trained on millions of positive and negative examples. 增强的商业应用(例如,数码相机中的面部聚焦)在数百万个正面和负面示例上进行了训练。

A quick Google image search turns up some good data -- I'm going to have a go at collecting some training examples and coding this up right now! 一个快速的谷歌图像搜索提供了一些好的数据 - 我将去收集一些培训示例并立即对其进行编码!

However, even a machine learning approach (or the rule-based approach suggested by @iND) will struggle for an image like the Land of Waldos ! 然而,即使是机器学习方法(或@iND建议的基于规则的方法)也会为像沃尔多斯这样的形象而斗争

发布了0 篇原创文章 · 获赞 7 · 访问量 2万+

猜你喜欢

转载自blog.csdn.net/asdfgh0077/article/details/105214739