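Optionally, a Partitioner for key-value RDDs (e.g. to say that the RDD is hash-partitioned). This property is optional and applies only to RDDs of key-value pairs: such an RDD can carry a partition function that decides which partition each key is assigned to.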
Optionally, a list of preferred locations to compute each split on (e.g. block locations for an HDFS file). Moving computation is cheaper than moving data: the task for a split is started, where possible, on the server that already holds the corresponding file block, so that the data does not have to be copied across the network.
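The two optional properties above can be illustrated with a minimal sketch in plain Python. It is not Spark code. The function names (`hash_partition`, `partition_pairs`, `choose_host`) and the host names are hypothetical, and `zlib.crc32` is used only as a stand-in hash that is stable across runs. Spark's own partitioner and scheduler are more involved than this.

```python
import zlib


def hash_partition(key, num_partitions):
    """Map a key to a partition index in [0, num_partitions)."""
    # Assumption of this sketch: crc32 over repr(key) stands in for the
    # key's hash code so that results are stable between Python runs.
    digest = zlib.crc32(repr(key).encode("utf-8"))
    return digest % num_partitions


def partition_pairs(pairs, num_partitions):
    """Group (key, value) pairs into num_partitions buckets by key."""
    buckets = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        buckets[hash_partition(key, num_partitions)].append((key, value))
    return buckets


def choose_host(preferred_hosts, available_hosts):
    """Pick a host for a task, favoring hosts that already hold the data."""
    if not available_hosts:
        raise ValueError("no host available to run the task")
    for host in preferred_hosts:
        if host in available_hosts:
            return host  # data-local: the block does not need to be copied
    # No preferred host is free, so fall back to any available host;
    # the data then has to be read over the network.
    return sorted(available_hosts)[0]


if __name__ == "__main__":
    pairs = [("a", 1), ("b", 2), ("a", 3), ("c", 4)]
    for index, bucket in enumerate(partition_pairs(pairs, 3)):
        print(index, bucket)

    # Hypothetical block locations for two splits of a file.
    block_locations = {"split-0": ["node1", "node2"], "split-1": ["node3"]}
    free_hosts = {"node2", "node4"}
    for split, hosts in block_locations.items():
        print(split, "->", choose_host(hosts, free_hosts))
```

In this sketch all records with the same key land in the same bucket, which is what makes a hash-partitioned key-value RDD useful for per-key operations. `choose_host` only falls back to a non-local host when none of the preferred hosts is available.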