The references for this book are as follows:

【Chapter 1】

  • 1. Hadoop SVN address: http://svn.apache.org/repos/asf/hadoop/common/branches/
  • 2. Apache official homepage: http://hadoop.apache.org/releases.html

【Chapter 2】

  • 1. CDH3 download address: http://archive.cloudera.com/cdh/3/
  • 2. CDH4 download address: http://archive.cloudera.com/cdh4/cdh/4/
  • 3. Mesos official website: http://incubator.apache.org/mesos/
  • 4. Torque official website: http://www.adaptivecomputing.com/products/open-source/torque/
  • 5. HDFS-200: In HDFS, sync() not yet guarantees data available to the new readers
  • 6. HDFS-265: Revisit append
  • 7. HDFS-RAID: http://wiki.apache.org/hadoop/HDFS-RAID
  • 8. HDFS-503: Implement erasure coding as a layer on HDFS
  • 9. HDFS-245: Create symbolic links in HDFS
  • 10. HADOOP-4487: Security features for Hadoop
  • 11. MAPREDUCE-279: Map-Reduce 2.0
  • 12. HDFS-1052: HDFS scalability with multiple namenodes
  • 13. HDFS-1623: High Availability Framework for HDFS NN
  • 14. http://www.cloudera.com/blog/2012/01/an-update-on-apache-hadoop-1-0/
  • 15. HADOOP-6332: Large-scale Automated Test Framework
  • 16. MAPREDUCE-1084: Implementing aspects development and fault injection framework for MapReduce
  • 17. Spark official homepage: http://www.spark-project.org/
  • 18. YARN-3: Add support for CPU isolation/monitoring of containers
  • 19. http://wiki.apache.org/hadoop/PoweredByYarn

【Chapter 3】

  • 1. Thrift homepage: http://thrift.apache.org/
  • 2. Protocol Buffer homepage: http://code.google.com/p/protobuf/
  • 3. Avro homepage: http://avro.apache.org/
  • 4. HADOOP-7347: IPC Wire Compatibility
  • 5. MAPREDUCE-2930: Generate state graph from the State Machine Definition

【Chapter 4】

  • 1. YARN-103: Add a yarn AM - RM client module
  • 2. YARN-422: Add NM client library
  • 3. YARN-314: Schedulers should allow resource requests of different sizes at the same priority and location

【Chapter 5】

  • 1. Haml framework homepage: http://haml.info/
  • 2. MAPREDUCE-2399: The embedded web framework for MAPREDUCE-279
  • 3. YARN-128: RM Restart
  • 4. YARN-149: ResourceManager (RM) High-Availability (HA)
  • 5. YARN-353: Add Zookeeper-based store implementation for RMStateStore
  • 6. YARN-291: Dynamic resource Configuration
  • 7. HADOOP-9621: Document/analyze current security model

【Chapter 6】

  • 1. http://hadoop.apache.org/docs/stable/hod_scheduler.html
  • 2. http://www.adaptivecomputing.com/products/open-source/torque/
  • 3. YARN-137: Change the default scheduler to the CapacityScheduler
  • 4. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types. A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica, NSDI 2011, March 2011.
  • 5. Max-Min Fairness (Wikipedia): http://en.wikipedia.org/wiki/Max-min_fairness
  • 6. http://hadoop.apache.org/docs/stable/capacity_scheduler.html
  • 7. http://hadoop.apache.org/docs/stable/fair_scheduler.html
  • 8. MAPREDUCE-1380: Adaptive Scheduler
  • 9. MAPREDUCE-1439: Learning Scheduler
  • 10. Thomas Sandholm and Kevin Lai. Dynamic proportional share scheduling in Hadoop. In JSSPP '10: 15th Workshop on Job Scheduling Strategies for Parallel Processing, 2010.

【Chapter 7】

  • 1. MAPREDUCE-3143: Complete aggregation of user-logs spit out by containers onto DFS
  • 2. YARN-321: Generic application history service
  • 3. http://en.wikipedia.org/wiki/Cgroups
  • 4. http://lxc.sourceforge.net/
  • 5. YARN-2: Enhance CS to schedule accounting for both memory and cpu cores
  • 6. https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt
  • 7. https://www.kernel.org/doc/Documentation/scheduler/sched-design-CFS.txt
  • 8. "Red Hat Enterprise Linux 6 Resource Management Guide"

【Chapter 8】

  • 1. MAPREDUCE-2405: MR-279: Implement uber-AppMaster (in-cluster LocalJobRunner for MRv2)
  • 2. HADOOP-3245: Provide ability to persist running jobs (extend HADOOP-1876)
  • 3. MAPREDUCE-4049: plugin for generic shuffle service
  • 4. MAPREDUCE-5108: Changes needed for Binary Compatibility for MR applications via YARN

【Chapter 9】

  • 1. Oozie homepage: http://oozie.apache.org/
  • 2. Cascading homepage: http://www.cascading.org/
  • 3. http://hortonworks.com/blog/apache-hive-0-11-stinger-phase-1-delivered/
  • 4. Azkaban homepage: http://data.linkedin.com/opensource/azkaban
  • 5. https://issues.apache.org/jira/browse/TEZ

【Chapter 10】

  • 1. Storm Wiki: https://github.com/nathanmarz/storm/wiki
  • 2. Yahoo! S4: http://incubator.apache.org/s4/
  • 3. Storm examples: https://github.com/nathanmarz/storm-starter
  • 4. Storm On YARN: https://github.com/yahoo/storm-yarn
  • 5. http://spark-project.org/
  • 6. Kryo serializer: http://code.google.com/p/kryo/
  • 7. Introduction to Storm: http://www.searchtb.com/2012/09/introduction-to-storm.html
  • 8. Xu Mingming's blog: http://xumingming.sinaapp.com/category/storm/

【Chapter 11】

  • 1. Corona source code: https://github.com/facebook/hadoop-20/tree/master/src/contrib/corona
  • 2. Under the Hood: Scheduling MapReduce jobs more efficiently with Corona

【Chapter 12】

  • 1. Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center. B. Hindman, A. Konwinski, M. Zaharia, A. Ghodsi, A.D. Joseph, R. Katz, S. Shenker, and I. Stoica, NSDI 2011, March 2011.
  • 2. Dpark: https://github.com/douban/dpark/
  • 3. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types. A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica, NSDI 2011, March 2011.
  • 4. Libprocess code: https://github.com/3rdparty/libprocess

【Chapter 13】

  • 1. SCHWARZKOPF, M., KONWINSKI, A., ABD-EL-MALEK, M., AND WILKES, J. Omega: flexible, scalable schedulers for large compute clusters. In Proc. EuroSys (2013).
  • 2. https://issues.apache.org/jira/browse/GIRAPH-13
  • 3. MAPREDUCE-2911: Hamster: Hadoop And Mpi on the same cluSTER
  • 4. Weave homepage: http://continuuity.github.io/weave/
  • 5. Kitten homepage: https://github.com/cloudera/kitten