I am new to Java 8 and am working on a requirement that uses Streams. I have a CSV file with thousands of records. The format is:
DepId,GrpId,EmpId,DepLocation,NoofEmployees,EmpType
===
D100,CB,244340,USA,1000,Contract
D101,CB,543126,USA,1900,Permanent
D101,CB,356147,USA,1800,Contract
D100,DB,244896,HK,500,SemiContract
D100,DB,543378,HK,100,Permanent
My requirement is to filter the records on two conditions: a) EmpId starts with "244" or "543", and b) EmpType is "Contract" or "Permanent".
I tried the following:
try (Stream<String> stream = Files.lines(Paths.get(fileAbsolutePath))) {
    list = stream
            .filter(line -> line.contains("244") || line.contains("543"))
            .collect(Collectors.toList());
}
This does filter the employees on 244 and 543, but my concern is that, because I am using contains, it may fetch other data as well: it matches anywhere in the line, not only in the EmpId column, and other columns might also hold values starting with these numbers.
Similarly, since I am reading line by line, I see no way to enforce that EmpType is either "Permanent" or "Contract".
Am I missing any advanced options?
You can do it like this:
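import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

Pattern comma = Pattern.compile(",");
// EmpId must start with 244 or 543; matches() tests the whole field, not a substring.
Pattern empNum = Pattern.compile("(244|543)\\d*");
Pattern empType = Pattern.compile("Contract|Permanent");
try (Stream<String> stream = Files.lines(Paths.get("C:\\data\\sample.txt"))) {
    List<String> result = stream.skip(2)                   // skip the 2 header lines
            .map(l -> comma.split(l))                      // one String[] per row
            .filter(s -> s.length > 5)                     // ignore blank or short lines
            .filter(s -> empNum.matcher(s[2]).matches())   // EmpId column
            .filter(s -> empType.matcher(s[5]).matches())  // EmpType column
            .map(s -> String.join(",", s))                 // rebuild the line
            .collect(Collectors.toList());
    System.out.println(result);
} catch (IOException e) {
    e.printStackTrace();
}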
First, read the file and skip the 2 header lines. Then split each line on the , character. Keep only the rows whose EmpId (index 2) and EmpType (index 5) match. Next, join the tokens back together to re-form the line, and finally collect the lines into a List.
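If you would rather avoid regular expressions, the same steps can be sketched with startsWith and a Set lookup. This is only a minimal sketch under some assumptions: the class name EmployeeCsvFilter, the helpers matches and filter, and the column-index constants are hypothetical names made up for illustration; it assumes the first two lines are the header and the === separator, and that no field contains a quoted comma (a real CSV parser would be needed for that).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EmployeeCsvFilter {

    // Column positions, assumed from the header shown in the question.
    private static final int EMP_ID = 2;
    private static final int EMP_TYPE = 5;
    private static final int COLUMN_COUNT = 6;

    private static final Set<String> WANTED_TYPES =
            new HashSet<>(Arrays.asList("Contract", "Permanent"));

    // True only if EmpId starts with 244 or 543 AND EmpType is one of the wanted types.
    static boolean matches(String[] cols) {
        if (cols.length < COLUMN_COUNT) {
            return false; // blank or malformed line
        }
        String empId = cols[EMP_ID].trim();
        String empType = cols[EMP_TYPE].trim();
        return (empId.startsWith("244") || empId.startsWith("543"))
                && WANTED_TYPES.contains(empType);
    }

    // Skips the two leading non-data lines, then keeps the matching rows unchanged.
    static List<String> filter(Stream<String> lines) {
        return lines.skip(2)
                .filter(line -> matches(line.split(",")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java EmployeeCsvFilter <csv-file>");
            return;
        }
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            filter(lines).forEach(System.out::println);
        }
    }
}
```

Because the original line is kept as a String and only split inside the predicate, there is no need to join the tokens back together afterwards. The Set also makes it easy to add further EmpType values later.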