Java Selenium: headless crawling of pages that require login + captcha cropping + captcha recognition with Python TensorFlow

1. To use PhantomJSDriver as a headless browser (a browser with no graphical interface), first open the page, then take a screenshot with the following method

// Take a screenshot with Selenium
File screenshotAs = ((TakesScreenshot) phantomJSDriver).getScreenshotAs(OutputType.FILE);


2. Find the position and size of the captcha image element so that it can be cropped out of the screenshot
List<WebElement> findElement = phantomJSDriver.findElements(By.xpath(imageHtmlTagXpath));
WebElement webElement = findElement.get(0);
int x = webElement.getLocation().getX();
int y = webElement.getLocation().getY();
int width = webElement.getSize().getWidth();
int height = webElement.getSize().getHeight();


// Crop the captcha out of the screenshot and recognize it with Python TensorFlow
String imageAndRecognize = ImageDownLoadTool.cut(screenshotAs.getPath(), x, y, width, height);

// Requires: java.awt.Graphics, java.awt.image.BufferedImage, java.io.File,
// java.io.IOException, java.util.UUID, javax.imageio.ImageIO

/**
 * Crop the picture, save the cropped region as a new picture, and run the
 * Python recognition script on it.
 *
 * @return the recognized text, or the path of the cropped picture if the
 *         script returned no result
 */
public static String cut(String fisName, int x, int y, int width, int height)
        throws IOException {
    String uuidImage = UUID.randomUUID().toString();
    String imagePath = "D:\\images\\instanceImageCode\\" + uuidImage
            + ".jpeg";

    // Read the screenshot synchronously (Toolkit.getImage loads asynchronously)
    BufferedImage image = ImageIO.read(new File(fisName));
    if (image == null) {
        throw new IOException("Cannot read image: " + fisName);
    }
    // Copy the cropped region into an RGB image so that the JPEG writer
    // never receives an alpha channel from the screenshot
    BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
    Graphics graphics = out.getGraphics();
    graphics.drawImage(image.getSubimage(x, y, width, height), 0, 0, null);
    graphics.dispose();
    ImageIO.write(out, "jpeg", new File(imagePath));

    // Pass the cropped picture to the Python script and clean up its output
    String pyFilePath = "D:\\PythonWorkPlace\\captcha_recognize-master\\java_recognize.py";
    Object result = JavaInvokePython.getInstance()
            .invokePythonScriptByStream(pyFilePath, imagePath);
    if (result != null) {
        String str = (String) result;
        // Strip single quotes and square brackets from the script output
        str = str.replaceAll("[\'\\[\\]]", "");
        return str;
    }
    return imagePath;
}
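
BufferedImage.getSubimage throws a RasterFormatException when the requested rectangle extends past the image bounds, which can happen if the element's reported location and size do not line up with the screenshot. The following is a minimal sketch of clamping the region before cropping. The class CaptchaCrop and its methods are hypothetical names introduced for illustration; they are not part of the original code.

```java
import java.awt.Graphics;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

/** Hypothetical helper for illustration; not part of the original post. */
final class CaptchaCrop {

    /** Clamp the requested region so that it lies inside an image of the given size. */
    static Rectangle clamp(int x, int y, int width, int height,
            int imageWidth, int imageHeight) {
        Rectangle bounds = new Rectangle(0, 0, imageWidth, imageHeight);
        // intersection() returns an empty rectangle when the two do not overlap
        return bounds.intersection(new Rectangle(x, y, width, height));
    }

    /** Crop the clamped region and copy it into an RGB image. */
    static BufferedImage crop(BufferedImage screenshot, int x, int y,
            int width, int height) {
        Rectangle r = clamp(x, y, width, height,
                screenshot.getWidth(), screenshot.getHeight());
        if (r.isEmpty()) {
            throw new IllegalArgumentException("Crop region lies outside the screenshot");
        }
        BufferedImage sub = screenshot.getSubimage(r.x, r.y, r.width, r.height);
        BufferedImage rgb = new BufferedImage(r.width, r.height, BufferedImage.TYPE_INT_RGB);
        Graphics g = rgb.getGraphics();
        g.drawImage(sub, 0, 0, null);
        g.dispose();
        return rgb;
    }
}
```

A caller would read the screenshot with ImageIO.read, pass the element's x, y, width and height to CaptchaCrop.crop, and write the result with ImageIO.write. Clamping silently shrinks the region, so a cropped picture that is smaller than expected is a hint that the coordinates were off.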

There are two ways to call a Python script from Java

      The first uses ProcessBuilder (can be called from multiple threads)

// ArrayUtils comes from Apache Commons Lang
public Object invokePythonScriptByStream(String pythonFilePath,
        String... filePath) {
    String result = "";
    // The executable name must not carry trailing whitespace
    String[] arg1 = new String[] { "python", pythonFilePath };
    String[] addAll = ArrayUtils.addAll(arg1, filePath);
    try {
        ProcessBuilder pb = new ProcessBuilder(addAll);
        Process process = pb.start();
        // Read the output before waitFor(), so that a full pipe buffer
        // cannot block the Python process
        try (LineNumberReader input = new LineNumberReader(
                new InputStreamReader(process.getInputStream()))) {
            result = input.readLine();
        }
        process.waitFor();
        System.out.println("python call succeeded: " + result);
    } catch (Exception e) {
        System.out.println("python call exception: " + e.getMessage());
    }
    return result;
}
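
As a variation on the ProcessBuilder approach, the sketch below also checks the exit code of the script and leaves stderr attached to the console, so that warnings printed by a library are not mistaken for the prediction. PythonRunner, buildCommand and firstLine are hypothetical names used for illustration, and the assumption that the script prints its result on the first line of stdout is taken from the readLine call in the code above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the original post. */
final class PythonRunner {

    /** Build the argument list: interpreter, script path, then script arguments. */
    static List<String> buildCommand(String interpreter, String script, String... args) {
        List<String> command = new ArrayList<>();
        command.add(interpreter.trim()); // guard against a stray space such as "python "
        command.add(script);
        command.addAll(Arrays.asList(args));
        return command;
    }

    /** Run the command and return the first line it prints on stdout. */
    static String firstLine(List<String> command)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        // Send the child's stderr to this process's stderr instead of a pipe
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = pb.start();
        String line;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            line = reader.readLine(); // read before waiting for the process to exit
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Script exited with code " + exitCode);
        }
        return line;
    }
}
```

A call might look like PythonRunner.firstLine(PythonRunner.buildCommand("python", pyFilePath, imagePath)). Throwing on a non-zero exit code lets the caller tell a failed run apart from an empty prediction.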


The second uses Runtime.exec (according to this article, it cannot be used for multi-threaded calls)

// ArrayUtils comes from Apache Commons Lang
public Object invokePythonScriptByStream(String pythonFilePath,
        String... filePath) {
    String result = "";
    // The executable name must not carry trailing whitespace
    String[] arg1 = new String[] { "python", pythonFilePath };
    String[] addAll = ArrayUtils.addAll(arg1, filePath);
    try {
        Process process = Runtime.getRuntime().exec(addAll);
        // Read the output before waitFor(), so that a full pipe buffer
        // cannot block the Python process
        try (LineNumberReader input = new LineNumberReader(
                new InputStreamReader(process.getInputStream()))) {
            result = input.readLine();
        }
        process.waitFor();
        System.out.println("python call succeeded: " + result);
    } catch (Exception e) {
        System.out.println("python call exception: " + e.getMessage());
    }
    return result;
}
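
Since the two variants are distinguished by whether they can be called from multiple threads, here is a minimal sketch of what such a multi-threaded call could look like with a fixed thread pool. ParallelRecognizer and recognizeAll are hypothetical names; the recognizer is passed in as a function so that either invocation method (or a stub) can be plugged in.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper for illustration; not part of the original post. */
final class ParallelRecognizer {

    /** Recognize every image on a fixed-size pool; results keep the input order. */
    static Map<String, String> recognizeAll(List<String> imagePaths,
            Function<String, String> recognizer, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String path : imagePaths) {
                Callable<String> task = () -> recognizer.apply(path);
                futures.add(pool.submit(task));
            }
            Map<String, String> results = new LinkedHashMap<>();
            for (int i = 0; i < imagePaths.size(); i++) {
                // get() blocks until the corresponding task has finished
                results.put(imagePaths.get(i), futures.get(i).get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("Recognition failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With the code from this article, the recognizer could be something like path -> (String) JavaInvokePython.getInstance().invokePythonScriptByStream(pyFilePath, path). Each call starts its own Python process, so the pool size also bounds how many interpreters run at once.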


Example of calling:

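public static void main1(String[] args) {
    // pyFilePath is local to cut(), so it has to be declared here as well
    String pyFilePath = "D:\\PythonWorkPlace\\captcha_recognize-master\\java_recognize.py";
    int count = 0;
    File fl = new File(
            "D:\\PythonWorkPlace\\captcha_recognize-master\\data\\test_data\\");
    String[] files = fl.list();
    if (files == null) {
        // list() returns null if the path is not a readable directory
        return;
    }
    for (String file : files) {
        File f = new File(fl, file);
        String filename = f.getAbsolutePath();
        System.out.println(filename);
        count++;
        Object result = JavaInvokePython.getInstance()
                .invokePythonScriptByStream(pyFilePath, filename);
        if (result != null) {
            String str = (String) result;
            str = str.replaceAll("[\'\\[\\]]", "");
            // The recognition counts as correct if the file name contains the recognized text
            if (!str.isEmpty() && filename.indexOf(str) != -1) {
                System.out.println("Match succeeded");
            }
        }
    }
}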
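
The replaceAll call in the examples strips single quotes and square brackets, which suggests that the Python script prints a list literal such as ['ab12']; that output format is an assumption here, not something the article states. Under that assumption, the cleanup and the file-name comparison can be sketched as two small functions. PredictionCheck, clean and matches are hypothetical names.

```java
/** Hypothetical helper for illustration; not part of the original post. */
final class PredictionCheck {

    /** Strip single quotes, square brackets and surrounding whitespace. */
    static String clean(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.replaceAll("['\\[\\]]", "").trim();
    }

    /** True if the cleaned prediction is non-empty and occurs in the file name. */
    static boolean matches(String filename, String raw) {
        String text = clean(raw);
        // indexOf("") is 0 for any string, so an empty prediction must be excluded
        return !text.isEmpty() && filename.contains(text);
    }
}
```

Counting how often matches returns true over the test directory, divided by the number of files, would give a rough accuracy figure for the recognizer, provided each test file is named after its captcha text.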



Origin http://43.154.161.224:23101/article/api/json?id=326264226&siteId=291194637