Flink distributed cache

distributed cache

  • Flink provides a distributed cache, similar to Hadoop's, that makes files available locally to the parallel instances of user functions. Each file is placed on the TaskManager nodes so that tasks do not have to pull it repeatedly. The cache works as follows: the program registers a file or directory (on a local or remote file system such as HDFS or S3) through ExecutionEnvironment under a name of its choice. When the program is executed, Flink automatically copies the file or directory to the local file system of every TaskManager node; this copy happens only once. A user function can then look up the file or directory by the registered name and read it from the TaskManager node's local file system.
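The register-once, copy-once, look-up-by-name behaviour described above can be pictured with a small toy model. This is only a sketch in plain Java and is not how Flink implements the cache: the class ToyDistributedCache, the node names, and the /tmp target paths are all hypothetical, and the "copy" is just a counter increment standing in for a real file transfer.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: NOT Flink's implementation, just the semantics described above.
class ToyDistributedCache {
    // name -> source path, filled in when the program registers a file
    private final Map<String, String> registered = new HashMap<>();
    // "node/name" -> node-local path, filled in the first time a node needs the file
    private final Map<String, String> localCopies = new HashMap<>();
    private int copyCount = 0;

    void registerCachedFile(String sourcePath, String name) {
        registered.put(name, sourcePath);
    }

    // Returns the node-local path, "copying" the file only on the first access per node.
    String getFile(String node, String name) {
        if (!registered.containsKey(name)) {
            throw new IllegalArgumentException("No cached file registered as " + name);
        }
        return localCopies.computeIfAbsent(node + "/" + name, key -> {
            copyCount++; // stands in for the real file transfer
            return "/tmp/" + key; // hypothetical local path
        });
    }

    int getCopyCount() {
        return copyCount;
    }

    public static void main(String[] args) {
        ToyDistributedCache cache = new ToyDistributedCache();
        cache.registerCachedFile("hdfs://cloud:9820/upload/test.txt", "test.txt");
        // Two lookups on the same node share one copy; a second node gets its own copy.
        cache.getFile("taskmanager-1", "test.txt");
        cache.getFile("taskmanager-1", "test.txt");
        cache.getFile("taskmanager-2", "test.txt");
        System.out.println(cache.getCopyCount()); // prints 2
    }
}
```

The point of the model is that the number of copies grows with the number of nodes, not with the number of tasks or lookups.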

Hands-on example

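import java.io.File;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.io.FileUtils;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.DataSource;
import org.apache.flink.configuration.Configuration;

class CacheMap extends RichMapFunction<String, String> {

    // Lines of the cached file, loaded once per parallel instance in open()
    private final List<String> dataList = new ArrayList<>();

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // Step 2: use the file, looked up by the name it was registered under
        File myFile = getRuntimeContext().getDistributedCache().getFile("test.txt");
        // Commons IO helper; the charset is passed explicitly (UTF-8 assumed here)
        List<String> lines = FileUtils.readLines(myFile, StandardCharsets.UTF_8);
        for (String line : lines) {
            this.dataList.add(line);
            System.err.println("Distributed cache content: " + line);
        }
    }

    @Override
    public String map(String value) throws Exception {
        // dataList is available here
        System.err.println("Using dataList: " + dataList + "------------" + value);
        // Business logic goes here
        return dataList + ":" + value;
    }
}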

public class DisCacheTest {

    public static void main(String[] args) throws Exception {
        // Get the execution environment
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Step 1: register a file under a name; it can be a file on HDFS,
        // or a local file for testing
        env.registerCachedFile("hdfs://cloud:9820/upload/test.txt", "test.txt");

        DataSource<String> data = env.fromElements("a", "b", "c", "d");

        DataSet<String> result = data.map(new CacheMap());

        // printToErr() triggers execution of the job
        result.printToErr();
    }
}
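
In the example above, map simply prepends the whole cached list to each element. A common way to fill in the business logic is to treat the cached file as a lookup table. The following is a minimal sketch in plain Java (no Flink dependency), assuming a hypothetical "key,value" line format for test.txt; the class CacheLookupSketch and its methods buildLookup and enrich are illustrative names, not part of Flink.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CacheLookupSketch {

    // Parses lines of the assumed form "key,value" into a lookup table.
    // Lines without a comma are skipped.
    static Map<String, String> buildLookup(List<String> lines) {
        Map<String, String> lookup = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                lookup.put(parts[0].trim(), parts[1].trim());
            }
        }
        return lookup;
    }

    // Appends the looked-up value to the element, or "unknown" if the key is absent.
    static String enrich(Map<String, String> lookup, String value) {
        return value + ":" + lookup.getOrDefault(value, "unknown");
    }

    public static void main(String[] args) {
        // Stand-in for the lines read from the cached file in open()
        List<String> lines = Arrays.asList("a,apple", "b,banana", "broken line");
        Map<String, String> lookup = buildLookup(lines);
        for (String value : Arrays.asList("a", "b", "c", "d")) {
            System.out.println(enrich(lookup, value));
        }
    }
}
```

Inside a RichMapFunction, buildLookup would be called once in open() on the lines read from the cached file, and enrich would be called from map() for each element.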

Origin blog.csdn.net/wolfjson/article/details/118601615